Tips for scraping LinkedIn public data at scale?

jamie3000

Jr. VIP
Joined
Jun 30, 2014
Messages
5,126
Reaction score
3,677
Does anyone have any tips for scraping LinkedIn public profiles?

I can get all the profile URLs fine from Google, but the LinkedIn auth wall triggers really quickly, even with my 100% clean residential IP. I can also beat the bot verification automatically, but it hardly ever triggers; I just get the auth wall instead.

So far I've looked at

Trying to pull the profiles from search engine caches, but they are all tagged noarchive
Pretending to be Google, with a Googlebot user agent and a Google crawler IP sent in the forwarding headers
Rotating user agents, cookies, etc.

I suppose the best route would be to make 100s of profiles and use some rotating 4G proxies and scrape while logged in?

So...anyone got any tips on scraping LinkedIn?
 

Algo

Jr. VIP
Joined
Mar 27, 2020
Messages
450
Reaction score
248
I suppose the best route would be to make 100s of profiles and use some rotating 4G proxies and scrape while logged in?
You can use the solution you mentioned.

Or scrape Sales Navigator using your search criteria to accomplish this.
 

Ashk881

Power Member
Joined
Aug 24, 2021
Messages
631
Reaction score
802
I can get all the profile URLs fine from Google, but the LinkedIn auth wall triggers really quickly, even with my 100% clean residential IP. I can also beat the bot verification automatically, but it hardly ever triggers; I just get the auth wall instead.
The authwall isn't there to prevent scraping. It's normal LinkedIn behaviour: they show an authwall to human users too when they try to access a profile through SERPs or direct links.

I suppose the best route would be to make 100s of profiles and use some rotating 4G proxies and scrape while logged in?
Yep.
 

imccafey

Jr. VIP
Joined
Aug 24, 2020
Messages
886
Reaction score
518
As mentioned above, the authwall is pretty typical even for casual LinkedIn users following direct links and such. I did wonder whether LinkedIn sends a cookie to track the number of profile views or whether the tracking is server-side. As far as I can tell, clearing the cache doesn't really affect it, so there's that.
 

yellowcat

Regular Member
Joined
Aug 27, 2015
Messages
379
Reaction score
261
Check out browser add-ons built specifically for LinkedIn and reverse those add-ons' APIs or bot them.
Clever search engine dorks can help too. You're better off hitting other resources that have already aggregated the data than trying to scrape LinkedIn directly.
 