Shaybib (Power Member; joined Mar 23, 2018; 786 messages; reaction score 225)
I have tried using Pegasus. It seems to mostly just change the word order. If Google tokenizes the post, they will clearly be duplicate sentences.
Google doesn't tokenize it because that would require a shit ton of processing power.
I will set up a simple script for you guys later this week so you got some place to start.
- Selenium and Helium if I need to automate a browser
- Nider for writing on images
- Pillow for image manipulation
- sentence_transformers, torch for ML
- spacy for NLP
- python-unsplash
- For uploading to WP REST API I'm just using requests/json/base64
Python 3.6 works best with these; on anything newer there's gonna be dependency hell.
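For the WP REST API upload mentioned in the list, here is a minimal sketch of what such a request could look like. It uses the standard library's urllib instead of requests so it runs without extra installs. The site URL and credentials are hypothetical, and it assumes Basic auth with a WordPress application password (presumably what base64 is for, but the post does not say).

```python
import base64
import json
import urllib.request


def build_wp_post_request(site_url, username, app_password, title, content, status="draft"):
    """Build (but do not send) a request that creates a post via the WP REST API."""
    # Basic auth token; assumes a WordPress application password is used.
    token = base64.b64encode(f"{username}:{app_password}".encode("utf-8")).decode("ascii")
    payload = json.dumps({"title": title, "content": content, "status": status}).encode("utf-8")
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/wp-json/wp/v2/posts",
        data=payload,
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical site and credentials, for illustration only.
    req = build_wp_post_request("https://example.com", "bot", "xxxx xxxx", "Hello", "<p>Body</p>")
    print(req.full_url, req.get_method())
    # To actually publish, send it with: urllib.request.urlopen(req)
```

The default status is "draft" so nothing goes live by accident while testing; switching it to "publish" is what an auto-poster would likely do.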
Just divide the project into small steps. For example:
- Take a keyword.
- Scrape top 10 results from Google
- Get something from these sites. Combine into an article. I used to extract content using Newspaper.
- Paraphrase. Start with this: https://huggingface.co/tuner007/pegasus_paraphrase
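The steps above can be wired together as plain functions. This is only a sketch of the "small steps" idea, not the poster's actual code: `build_article` and its `search`, `extract`, and `paraphrase` parameters are hypothetical names, and each one is passed in so that a SERP lookup, Newspaper, or Pegasus could be plugged in later.

```python
from typing import Callable, List


def build_article(
    keyword: str,
    search: Callable[[str], List[str]],
    extract: Callable[[str], str],
    paraphrase: Callable[[str], str],
    max_results: int = 10,
) -> str:
    """Run the steps: keyword -> top results -> extracted text -> combined -> paraphrased."""
    urls = search(keyword)[:max_results]
    # Skip pages where extraction returned nothing useful.
    pieces = [text for text in (extract(url) for url in urls) if text.strip()]
    combined = "\n\n".join(pieces)
    return paraphrase(combined)


if __name__ == "__main__":
    # Stand-in steps so the sketch runs without network access or a model.
    fake_pages = {
        "https://a.example": "First paragraph.",
        "https://b.example": "Second paragraph.",
    }
    article = build_article(
        "some keyword",
        search=lambda kw: list(fake_pages),
        extract=lambda url: fake_pages[url],
        paraphrase=lambda text: text.upper(),  # a real version might call Pegasus here
    )
    print(article)
```

Keeping every step behind its own function means each one can be tested and swapped separately.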
"Google doesn't tokenize it because that would require a shit ton of processing power."
It has to, in order to understand what the post is about and to find all the entities.
"Get something from these sites. Combine into an article. I used to extract content using Newspaper."
Congrats on your success. Amazing journey.
"This is very interesting. I have been working on a similar program that does Hugo pages. I'm trying to get it running in a Docker container so I can just put one container behind a domain and scale to a few domains."
No point, just put it on a server and post using the WordPress REST API. Easy AF.
"Which GPU model are you using? I have a GTX 1660 Ti in my laptop, is it enough? There is also a Sapphire R9 280X in my desktop. Could it be better with multiple GPUs, like those bitcoin miner rigs?"
I got two RTX 3080s...
"Do you connect your tool to some AI generator such as Jarvis? Or did you create your own AI generator?"
GPT is garbage. We use our own stuff/other services.
"Google doesn't tokenize it because that would require a shit ton of processing power."
100% true.
"I will set up a simple script for you guys later this week so you got some place to start."
"Is there any update about that?"
Sorry, I've been too busy. Real life got in the way too much (kids and stuff).
Btw, the OpenAI detector easily catches Pegasus content most of the time, and it doesn't take much time to process it. You can check it yourself.
From what I've observed, Google still doesn't care about AI-generated content, but Bing deindexes my Pegasus posts more often. I ran a test on my 2-year-old website: I posted 3 articles, all processed by Pegasus and Parrot paraphraser, and Bing deindexed my whole website in less than 10 days. Out of the 3 posts, 2 were ranked in the top 3. Maybe a coincidence...
"It has to, in order to understand what the post is about and to find all the entities."
Google cannot afford the computational power to do that. I'm 100% positive of that. Google "new york seo expert" and check what's #1. It's been up almost 3 months. (Hint: it's lorem ipsum with the phrase "New York SEO Expert" in between, plus black hat links.)
"What proxy service(s) are recommended when scraping Google in bulk? I seem to get endless captchas even though I'm already changing the user agent."
Instead of proxies I use a SERP API. I don't want to advertise any company here; you can Google it, there's a dozen to choose from. It costs me $10-20 to get enough scraping done to research and construct a big website. So nice not having to play around with proxies and handle errors.
"Congrats on your success. Amazing journey. Can I ask how you combine the scraped content into an article? How do you make the combined article perfect before moving on to paraphrasing? Can your bot automatically add h2 and h3 headlines to the combined article? Thanks"
Recently I found a fantastic and super fast Python library called yake, which I use to extract keywords from paragraphs/articles/long-tails, compare them with the keywords, and construct articles accordingly. That's what we use now. We used machine learning before, with very little difference. https://pypi.org/project/yake/
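As a rough illustration of the yake approach described in this exchange, here is a sketch that extracts keywords and scores how much they overlap with a list of target keywords. The Jaccard overlap is my assumption; the post does not say how the comparison is actually done. `extract_keywords`, `normalize`, and `keyword_overlap` are hypothetical helper names; only `yake.KeywordExtractor` and its `extract_keywords` method come from the library.

```python
from typing import Iterable, List, Set


def extract_keywords(text: str, top: int = 10) -> List[str]:
    """Extract keywords with yake (third-party: pip install yake)."""
    import yake  # imported lazily so the overlap helper works without it

    extractor = yake.KeywordExtractor(lan="en", n=2, top=top)
    pairs = extractor.extract_keywords(text)
    # Each item pairs a keyword with a score; pick the string whichever side it is on.
    return [a if isinstance(a, str) else b for a, b in pairs]


def normalize(keywords: Iterable[str]) -> Set[str]:
    """Lowercase and trim keywords, dropping empty ones."""
    return {kw.strip().lower() for kw in keywords if kw.strip()}


def keyword_overlap(found: Iterable[str], targets: Iterable[str]) -> float:
    """Jaccard overlap between extracted and target keywords (0.0 to 1.0)."""
    a, b = normalize(found), normalize(targets)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A paragraph could then be kept or dropped depending on whether `keyword_overlap(extract_keywords(paragraph), targets)` clears some threshold you pick by experiment.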
"Thanks for answering my question! One more thing I'm a bit confused about: how do you manage duplicates with so many keywords and topics? How can you ensure each post is unique and that you never posted it before? I just feel like it's hard to avoid when too many posts roll in. Thank you"
I could, and I even have a "posted" value in my database, but I don't even bother because Google is that dumb right now.
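For anyone who does want to guard against duplicates, a "posted" flag like the one mentioned here could be kept in a small SQLite table. This is a sketch under my own assumptions: the table name, columns, and helper functions are hypothetical, and the poster's real schema is not shown anywhere in the thread.

```python
import sqlite3


def _key(keyword):
    """Normalize case and whitespace so near-identical keywords collide."""
    return " ".join(keyword.lower().split())


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        " keyword TEXT PRIMARY KEY,"
        " posted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_keyword(conn, keyword):
    """Queue a keyword; returns False if it is already known."""
    cur = conn.execute("INSERT OR IGNORE INTO articles (keyword) VALUES (?)", (_key(keyword),))
    conn.commit()
    return cur.rowcount == 1


def mark_posted(conn, keyword):
    conn.execute("UPDATE articles SET posted = 1 WHERE keyword = ?", (_key(keyword),))
    conn.commit()


def unposted(conn):
    """Keywords that are queued but not yet posted, in alphabetical order."""
    rows = conn.execute("SELECT keyword FROM articles WHERE posted = 0 ORDER BY keyword")
    return [row[0] for row in rows]
```

The primary key on the normalized keyword does the deduplication; the `posted` column only tracks what has already gone out.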
Hey folks,
I have a background in computer science. I already own several profitable content websites (but nothing crazy), and I'm tired of creating/outsourcing content.
I've created a simple app in Python that goes through the top results on Google for a given keyword, takes a paragraph from each for semantically relevant keywords, constructs a new article out of it, and paraphrases it using an AI tool. It also generates related images, adds nice formatting, and schema (I'm using FAQ schema a lot for PAA keywords).
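The FAQ schema mentioned above is typically embedded as a JSON-LD script tag. Here is a minimal sketch of generating one from question/answer pairs, using the schema.org FAQPage structure. The function name `faq_schema` is hypothetical and this is not the app's actual code.

```python
import json
from typing import List, Tuple


def faq_schema(qa_pairs: List[Tuple[str, str]]) -> str:
    """Render question/answer pairs as a schema.org FAQPage JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    # Escape "</" so an answer containing "</script>" cannot close the tag early.
    body = json.dumps(data, ensure_ascii=False, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(faq_schema([("What is yake?", "A keyword extraction library for Python.")]))
```

The returned string can be appended to the post content before it is uploaded.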
I'm using WordPress on Linode with Centminmod. Posting using the REST API.
In the near future I'm launching 3 sites:
- My passion hobby project: I will generate articles using AI, but edit them manually. So far this one is up with 3 articles, started yesterday.
- A big site where I will drip tens of thousands of posts without editing and try to monetize with display ads.
- Another fully automated site that will target local keywords for lead generation.
Attaching a sneak peek of my app. I will show you an example article in my next update.
Wish me luck!
"Which language did you use? Python or C#? Looks good as long as it finishes the job. GL"
Python!
"goes through the top results on Google for a given keyword, takes a paragraph from each for semantically relevant keywords, constructs a new article out of it"
I'm also very confused about this part, bro.
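One simple way to read the quoted step, as a sketch only: score each paragraph of each page against the keyword and keep the best one per page. The real app reportedly uses far more advanced NLP; the bag-of-words cosine similarity below is a deliberately crude stand-in that I swapped in so the example runs on the standard library alone, and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> Counter:
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_paragraph(paragraphs: List[str], keyword: str) -> str:
    """Return the paragraph most similar to the keyword, or "" if nothing matches."""
    query = tokens(keyword)
    scored = [(cosine(tokens(p), query), p) for p in paragraphs]
    score, best = max(scored, key=lambda item: item[0], default=(0.0, ""))
    return best if score > 0 else ""


def build_draft(pages: List[List[str]], keyword: str) -> str:
    """Take the best-matching paragraph from each page and join them into a draft."""
    picks = [best_paragraph(paragraphs, keyword) for paragraphs in pages]
    return "\n\n".join(p for p in picks if p)
```

Swapping `cosine` over word counts for sentence embeddings would be the obvious next step toward real semantic relevance.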
"I checked SERP API sites, but they are more expensive than you said, and there are limits: 5,000 searches for 50 USD. How did you do it so much cheaper? 1 search means 1 keyword, right? So for just 5,000 keywords you need to pay 50 USD. In another thread I saw a guy with 400-500k long-tail keywords; that would end up costing more than 5,000-6,000 USD."
If that's expensive for you, use this: https://github.com/topics/recaptcha-solver-python
It resolves reCaptcha from Google using voice recognition. Takes 2-3 minutes to implement. It has worked like a charm for me for the last 2 months.
"Which library did you use? Is it https://github.com/MacKey-255/GoodByeCatpcha ?"
This is a question that would require a very complex answer: hundreds of lines of code, pretty advanced algorithms, and usage of NLP libraries like spaCy.
Absolutely cannot. I'm only sharing it with a tiny network of business partners. Some methods that I'm using could be saturated and killed within days.
The project is going VERY well. It's running on 7 domains now, and the oldest site just crossed 10k monthly clicks with 4000+ articles. Google is still kinda as dumb as 15 years ago.
Btw, Google "new york seo expert" and check out the site that is hosted on Google pages. And tell me Google has a grasp on anything right now. It's a joke...
There is huge potential here. I'm building this app to be something really big: it does keyword research, competition analysis, article generation, images, and auto-posting to WordPress. 100% I'm never selling it.
Learning Python - best decision I've ever made.
That last line is very true. I thought coding was not a required skill for internet business. I realized its true power only when it helped me complete big tasks easily. From then on, I started to focus more on learning instead of searching for ways to make money online. Now I conclude that great success with an internet business is not possible without Python.