What library do you guys use in Go for web automation / web scraping?

Sometimes a library such as go-colly won't suffice because you might need to click some buttons before reaching the webpage to be scraped, as in selecting the language of the website and accepting cookies upon opening it.

I found go-rod and chromedp, but they seem to lag behind libraries such as Puppeteer in terms of issue resolution and available features.

4 answers

11 views

I manually copy http requests and implement a web API for the target

Pedro-Aguiar (question author)

I manually copy http requests and implement a web ...

What if the website in question doesn't allow you to skip those steps by passing parameters to the HTTP requests? Idk what this is called, or whether it's some sort of obfuscation technique, but websites such as StockX ask the user to pick a language upon accessing the website, and after you do, regardless of what language you chose, the URL remains the same.

Pedro Aguiar
What if the website in question doesn't allow you ...

No matter what, if you copy the HTTP requests step by step from the browser's network tab or a tool like Fiddler and implement those steps in a programming language, you'll be able to scrape content and even submit forms. There's one challenging case, though: websites protected by a CAPTCHA. Some people bypass that as well using OCR or AI. If the website is asking you to choose a language, it means it is sending an HTTP request to set your language. That's the first request to implement.

Pedro-Aguiar (question author)
Pedro Aguiar
What if the website in question doesn't allow you ...

For anyone wondering how I solved this: @ali_error (thanks again!) was dead right. Checking for a hidden API is far more effective than scraping data from the frontend. Both of the websites I had to work with in that project have a hidden API that can be consumed as long as you have a Cookie, which is a game changer. The following video encapsulates the idea they were referring to in their answer. [1] https://www.youtube.com/watch?v=G7s0eGOaRPE
