@Romshark Thirteen Thank you. The challenge isn't parallelizing. The problem is the difference between different websites!

For example, suppose all of my websites are e-commerce sites and I want to grab the product price from each of them.
So what should I do?

Of course, Amazon has its own template, and Alibaba has its own! So I have to tell my scraper how to find the price on each site (CSS selectors, a regex, JSON unmarshalling, or...)

My current idea is to add one function per website, which accepts the HTML document and returns the expected information from that website (for example, the product price).
But as I said, I don't think that's a good plan!
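
To make the idea concrete, here is a rough sketch of what "one function per website" could look like. It assumes Go (the thread doesn't name the language), and the host names, HTML snippets, patterns, and the names `Extractor`, `byRegexp`, and `ExtractPrice` are all made up for illustration. It uses only the standard library and a regex per site; a real scraper would more likely use an HTML parser or CSS selectors.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Extractor takes the raw HTML of a product page and returns the price text.
type Extractor func(html string) (string, error)

// ErrNoPrice is returned when a site's pattern does not match the page.
var ErrNoPrice = errors.New("price not found")

// byRegexp builds an Extractor that returns the first capture group of pattern.
func byRegexp(pattern string) Extractor {
	re := regexp.MustCompile(pattern)
	return func(html string) (string, error) {
		m := re.FindStringSubmatch(html)
		if m == nil {
			return "", ErrNoPrice
		}
		return m[1], nil
	}
}

// extractors maps a host name to its extractor.
// Both hosts and both patterns are hypothetical examples.
var extractors = map[string]Extractor{
	"shop-a.example": byRegexp(`<span class="price">\$([0-9.]+)</span>`),
	"shop-b.example": byRegexp(`"price":\s*"([0-9.]+)"`),
}

// ExtractPrice looks up the extractor registered for host and runs it on html.
func ExtractPrice(host, html string) (string, error) {
	extract, ok := extractors[host]
	if !ok {
		return "", fmt.Errorf("no extractor registered for %q", host)
	}
	return extract(html)
}

func main() {
	price, err := ExtractPrice("shop-a.example", `<div><span class="price">$19.99</span></div>`)
	fmt.Println(price, err) // prints "19.99 <nil>"
}
```

This shows the drawback described here: the `extractors` map is compiled into the binary, so supporting a new website means editing the code and rebuilding.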

Because I'd have to add many functions to my program!
And when there is a new website, I have to stop my program, add a new function for it, rebuild, and run it again!
Each time a new website needs to be added, I have to rebuild my program!

2 answers

12 views

Or you could spend 10 years building a machine learning model that automatically scrapes websites no matter what layout they use. Why build something in 30 minutes if you can automate it in 10 hours? 😂

User-61931 (question author)

Any idea?
