Web scraping and rvest help

Hi, I want to transfer the products from my wholesaler's site. I have been working on this for a month. I want to download the products and export them to Excel. I would really appreciate any help. My code is as follows:

#Loading the rvest package
library('rvest')

#Specifying the url for desired website to be scraped
url <- 'https://www.vadibilisim.com/cantalar'

#Reading the HTML code from the website
webpage <- read_html(url)
#Using CSS selectors to scrape the product titles
rank_data_html <- html_nodes(webpage,'.urunBaslik')

#Converting the product title nodes to text
rank_data <- html_text(rank_data_html)

#Let's have a look at the product titles
head(rank_data)

#Data-Preprocessing: the titles are text, so as.numeric() would only return NA here
#rank_data<-as.numeric(rank_data)

#Let's have another look at the product titles
head(rank_data)

#Using CSS selectors to scrape the title section
title_data_html <- html_nodes(webpage,'#adetsecim')

#Converting the title data to text
title_data <- html_text(title_data_html)

#Let's have a look at the title
head(title_data)

For starters, the rvest package site has some nice tutorials to help you get started. For example, the SelectorGadget article has good tips on how to select the content you'd like to pull from a page, and the tutorials cover things like how to extract text with html_text.

You're also going to want to look into pagination with rvest. Thankfully, your website has a fairly simple page scheme (https://www.vadibilisim.com/cantalar/{page_number}), so you can just run a loop to go through all the pages.

Here is how to pull the product names and links from the first page:

library('rvest')
url <- 'https://www.vadibilisim.com/cantalar'
webpage <- read_html(url)

# get product names on page
product_names <- html_nodes(webpage,'.urunBaslik') %>% 
  html_text()
product_names
#>  [1] "Addison ADS-230 Siyah Kab..." "Addison 300448 15.6 Laciv..."
#>  [3] "Addison 300448 15.6 Pembe..." "Addison 300448 15.6 Gri N..."
#>  [5] "Valja VL-241 Noktalı Bey..."  "Addison 300458 18 Siyah G..."
#>  [7] "Addison 300682 15.6 Laciv..." "Addison 301008 15.6 Siyah..."
#>  [9] "Addison 301004 15.6 Kamuf..." "Addison 301003 15.6 Koyu ..."
#> [11] "Addison 301005 15.6 Laciv..." "Addison 300492 15.6-16 Si..."
#> [13] "Addison ADS-202 Gümüş ..."    "Addison 300998 15.6 Gri/S..."
#> [15] "Addison 301000 14 Gri/Siy..." "Addison 300997 15.6 Gri/S..."
#> [17] "Addison 300993 15.6 Koyu ..." "Addison ST-490 Siyah/Gri ..."
#> [19] "Addison 300494 15.6 Bordo..." "Addison 300441 15.6 Kamuf..."

#get product links on page
html_nodes(webpage,'.urunBaslik') %>% 
  html_nodes('a') %>% 
  html_attr('href')
#>  [1] "/addison-ads-230-siyah-kabin-boy-valiz"                                      
#>  [2] "/addison-300448-15-6-lacivert-notebook-sirt-cantasi"                         
#>  [3] "/addison-300448-15-6-pembe-notebook-sirt-cantasi"                            
#>  [4] "/addison-300448-15-6-gri-notebook-sirt-cantasi"                              
#>  [5] "/valja-vl-241-noktali-beyaz-prestige-serisi-pvc-orta-boy-valiz"              
#>  [6] "/addison-300458-18-siyah-gaming-bilgisayar-notebook-cantasi"                 
#>  [7] "/addison-300682-15-6-lacivert-bilgisayar-notebook-cantasi"                   
#>  [8] "/addison-301008-15-6-siyah-notebook-sirt-cantasi"                            
#>  [9] "/addison-301004-15-6-kamuflaj-notebook-sirt-cantasi"                         
#> [10] "/addison-301003-15-6-koyu-gri-bilgisayar-notebook-cantasi"                   
#> [11] "/addison-301005-15-6-lacivert-notebook-sirt-cantasi"                         
#> [12] "/addison-300492-15-6-16-siyah-bilgisayar-notebook-sirt-cantasi"              
#> [13] "/addison-ads-202-gumus-kabin-boy-pilot-valiz"                                
#> [14] "/addison-300998-15-6-gri-siyah-bilgisayar-notebook-cantasi"                  
#> [15] "/addison-301000-14-gri-siyah-notebook-sirt-cantasi"                          
#> [16] "/addison-300997-15-6-gri-siyah-notebook-sirt-cantasi"                        
#> [17] "/addison-300993-15-6-koyu-gri-bilgisayar-notebook-sirt-cantasi"              
#> [18] "/addison-st-490-siyah-gri-gizli-fermuarli-bilgisayar-notebook-sirt-cantasi"  
#> [19] "/addison-300494-15-6-bordo-sirt-cantasi"                                     
#> [20] "/addison-300441-15-6-kamuflaj-desenli-sport-bilgisayar-notebook-sirt-cantasi"

Created on 2021-03-03 by the reprex package (v1.0.0)
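
Building on the example above, here is a rough sketch of how the pagination loop could be put together with the name and link extraction. It assumes the /cantalar/{page_number} scheme holds for every page and that each '.urunBaslik' element contains at most one link. The helper names page_url, scrape_products, and scrape_all are hypothetical and exist only for illustration. Treat it as a starting point rather than a finished scraper.

```r
library(rvest)

base_url <- "https://www.vadibilisim.com"

# Build the URL of one listing page, assuming the
# /cantalar/{page_number} scheme mentioned above.
page_url <- function(page_number) {
  paste0(base_url, "/cantalar/", page_number)
}

# Extract product names and links from one parsed page (hypothetical helper).
scrape_products <- function(webpage) {
  titles <- html_nodes(webpage, ".urunBaslik")
  # html_node() returns one result per title, so names and links stay aligned
  # (the link is NA when a title has no <a> inside it).
  links <- titles %>% html_node("a") %>% html_attr("href")
  # Assumes hrefs are root-relative ("/..."), as in the output above.
  has_link <- !is.na(links)
  links[has_link] <- paste0(base_url, links[has_link])
  data.frame(
    name = trimws(html_text(titles)),
    link = links,
    stringsAsFactors = FALSE
  )
}

# Loop over the given page numbers and stack the results (hypothetical helper).
scrape_all <- function(page_numbers) {
  pages <- lapply(page_numbers, function(p) {
    Sys.sleep(1)  # short pause between requests
    scrape_products(read_html(page_url(p)))
  })
  do.call(rbind, pages)
}
```

As a usage example, products <- scrape_all(1:3) followed by write.csv(products, "products.csv", row.names = FALSE) would produce a CSV file that Excel can open. The page range 1:3 and the file name products.csv are placeholders, so adjust them to the site's actual page count. If you need a real .xlsx file, the writexl package's write_xlsx function is one option.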


It'd be helpful to be explicit about what you are trying to collect and the trouble you're having.
