Webscraping from list of values in Table

Hi all,

I have a list of 77 unique patent IDs (e.g. US03082024A1, EN03082019B2, etc.; n = 77).

I want to use R to automate searching Google Patents (URL: https://patents.google.com/) and pull the following data for each patent ID: patent classification code, application year, patent title, company, and abstract.

The resulting file would be saved as a CSV whose column names match the data fields above.

Many thanks!

Hi @mchina1,
I tried to help, but I could only get the abstract for US03082024A1. EN03082019B2 is not a valid patent ID on this site.

I used rvest. I had problems finding the correct nodes for the other items. I'm sure a more advanced R user could help better.

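library(rvest)

# The URL uses the publication number US20030082024A1 for the searched ID US03082024A1
link <- 'https://patents.google.com/patent/US20030082024A1/en?oq=US03082024A1'

# Read the page and extract the text of the abstract node
url_data1 <- link |>
  read_html() |>
  html_nodes(xpath='//*[@id="A-0001"]') |>
  html_text()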

> url_data1
[1] "A cargo bar having reduced costs due in part to being constructed from square tubes and due to being 
collapsible to a length that fits a 4 foot pallet so as to facilitate shipping and storage. Pressure induced 
extension of the cargo bar against opposed truck walls is provided by a rack and pinion gear arrangement, the 
rack teeth provided on a first tube wall and the pinion teeth provided on a pivotal lever mounted to a second 
tube. The bar ends have pressure pads that will conform to side walls of a truck or van and the tube interior is 
alternately fitted with retractable track pins that extend through the pads and retract behind the pads to 
accommodate different cargo bar systems. "
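
For the other items, one option might be to read the page's <meta> tags instead of hunting for nodes in the body. This is only a sketch: the tag names used here (DC.title, DC.contributor with scheme "assignee", DC.date with scheme "dateSubmitted") are my assumption about what Google Patents serves and should be checked against the page source. meta_content() is a hypothetical helper, and the sample HTML contains made-up placeholder values so the code runs offline.

```r
library(rvest)

# Hypothetical helper: return the content attribute of the first <meta> tag
# with the given name (and optional scheme), or NA if no such tag exists.
meta_content <- function(page, name, scheme = NULL) {
  xpath <- if (is.null(scheme)) {
    sprintf('//meta[@name="%s"]', name)
  } else {
    sprintf('//meta[@name="%s"][@scheme="%s"]', name, scheme)
  }
  value <- page |> html_elements(xpath = xpath) |> html_attr("content")
  if (length(value) == 0) NA_character_ else trimws(value[1])
}

# Stand-in page with placeholder values (not real patent data).
# The tag names are an assumption about the real Google Patents markup.
sample_page <- read_html('<html><head>
  <meta name="DC.title" content="Example title">
  <meta name="DC.contributor" content="Example Co" scheme="assignee">
  <meta name="DC.date" content="2000-01-31" scheme="dateSubmitted">
</head><body></body></html>')

title    <- meta_content(sample_page, "DC.title")
company  <- meta_content(sample_page, "DC.contributor", "assignee")
# Keep only the year part of the date string
app_year <- substr(meta_content(sample_page, "DC.date", "dateSubmitted"), 1, 4)
```

To try it on a real page, replace sample_page with read_html(link). Any tag that is missing comes back as NA rather than an error.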

Could you provide the other patent IDs so I can try to download all the abstracts?

Hi, thanks a lot!

The code above seems like it would certainly work.

I have attached two patents to trial:
- US10952730B2
- EP3155984B1

I was just wondering whether there is a way to read a CSV file containing these patent IDs and pull the data above for each one, i.e. the patent title, abstract, and application year?

Many many thanks

Hi @mchina1,
This patent doesn't have an abstract, and my script only downloads the abstract.
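
As for reading the IDs from a CSV and writing the results to another CSV, a rough sketch could look like the following. It makes several assumptions: the input file has a column called patent_id, the IDs are already in the form Google Patents uses in its URLs (as in the link earlier in the thread), and the abstract sits in an element whose class contains "abstract". The function names patent_url(), scrape_patent() and scrape_to_csv() are hypothetical, and I have not checked the XPath against every patent page.

```r
library(rvest)

# Hypothetical helper: build the page URL for one ID, following the pattern
# of the link used earlier in this thread.
patent_url <- function(id) {
  paste0("https://patents.google.com/patent/", id, "/en")
}

# Hypothetical helper: scrape one patent into a one-row data frame.
# A failed request or a missing node gives NA instead of stopping the loop.
scrape_patent <- function(id, fetch = read_html) {
  page <- tryCatch(fetch(patent_url(id)), error = function(e) NULL)
  first_text <- function(xpath) {
    if (is.null(page)) return(NA_character_)
    txt <- page |> html_elements(xpath = xpath) |> html_text2()
    if (length(txt) == 0) NA_character_ else txt[1]
  }
  data.frame(
    patent_id = id,
    title     = first_text("//title"),
    # Assumed location of the abstract; adjust after inspecting the page
    abstract  = first_text('//*[contains(@class, "abstract")]'),
    stringsAsFactors = FALSE
  )
}

# Read the IDs, scrape each one with a pause in between, and save the result.
scrape_to_csv <- function(ids_file, out_file, fetch = read_html, pause = 1) {
  ids <- read.csv(ids_file, stringsAsFactors = FALSE)$patent_id
  rows <- lapply(ids, function(id) {
    Sys.sleep(pause)  # wait between requests to avoid hammering the site
    scrape_patent(id, fetch)
  })
  result <- do.call(rbind, rows)
  write.csv(result, out_file, row.names = FALSE)
  invisible(result)
}

# Usage (needs network access; file names are placeholders):
# scrape_to_csv("patent_ids.csv", "patents.csv")
```

The fetch argument is only there so the page download can be swapped out for testing. Further columns such as application year or company could be added to scrape_patent() in the same way once the right nodes are known.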
