Monday 12 January 2015

Simple Proxy Scraper Python Script

Update:
New! (1/9/2015) A new/updated Proxy scraper Python script is available here.
(12/1/2015) A new Proxy scraper Python script is available here.

Hi guys,

I made a Python proxy-scraping script for Linux VPSes. The script should run fine on any Linux system, since most Linux systems come with Python as standard. If you want to run the script on Windows, download and install Python 2.7, not Python 3.

The script grabs about 120 proxies and saves them to a file named "proxies<date+time>.txt".
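For anyone curious how a timestamped output file like that can be produced, here is a minimal sketch. The exact date and time format used by the script is not stated in this post, so the format string below is an assumption, and the function names `proxy_filename` and `save_proxies` are made up for illustration.

```python
import time


def proxy_filename(now=None):
    """Build a name such as proxies2015-01-12_14-30-00.txt.

    The timestamp layout is an assumption; the real script may differ.
    """
    stamp = time.strftime("%Y-%m-%d_%H-%M-%S", now or time.localtime())
    return "proxies" + stamp + ".txt"


def save_proxies(proxies, path):
    """Write one ip:port entry per line to the given file."""
    with open(path, "w") as handle:
        for proxy in proxies:
            handle.write(proxy + "\n")
```

Calling `save_proxies(found, proxy_filename())` would then write each run to its own file instead of overwriting the previous one.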

Proxies are grabbed from:
http://proxy-list.org
http://www.samair.ru
http://spys.ru/
http://nntime.com/
http://proxy-ip-list.com/

The script was made to be dynamic and scrape any proxy site where the proxies are listed as "xxx.xxx.xxx.xxx:xxxxx" without anything between the IP and the port in the page source. You should be able to add other proxy sites if their proxies are listed that way: add each extra site to the "urls" variable under #PROGRAM.
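To give an idea of what "listed as ip:port in the source" means in practice, here is a rough sketch of that kind of matching with a regular expression. It is not the code from the download; the names `PROXY_RE` and `extract_proxies` and the sample string are made up for illustration, and it only works on text you have already fetched.

```python
import re

# Matches "ip:port" only when nothing sits between the address and the port.
PROXY_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")


def extract_proxies(page_source):
    """Return unique ip:port strings in order of first appearance."""
    seen = set()
    proxies = []
    for match in PROXY_RE.findall(page_source):
        if match not in seen:
            seen.add(match)
            proxies.append(match)
    return proxies


# Hypothetical page source: the second address has its port in a separate
# table cell, so it is skipped, and the repeated entry is kept only once.
sample = ("<td>10.0.0.1:8080</td><td>192.168.1.5</td><td>3128</td> "
          "10.0.0.1:8080 172.16.0.9:3128")
print(extract_proxies(sample))
```

This is also why a site that puts the IP and the port in separate table cells, or writes the port out with JavaScript, will not work just by adding its URL.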

Feel free to ask for help.
Is there any demand for Python, Linux programs or bots? I prefer making stuff for Linux over Windows.

Requirements
  - Python 2.7

Download
  - ScraperScript-1.0

8 comments:

  1. Hi,
    I can only scrape 500. I want to scrape them all. Could you help me add the page URLs for nntime and the other sites?
    1. Hi, This is a pretty old script.
      You can find a new proxy scraper script at http://scrapeomatic.blogspot.com/2015/09/update-proxy-scraper-script-v11python.html. I uploaded it today and you should be able to get more than 3200 proxies with this new script. Enjoy!

      Thanks for commenting.
  2. Thanks. Do you have a proxy checking script? It must split the proxies into L1, L2, L3 and socks. Can you help me?
    1. Sorry, I don't have anything like that, but if you give me a day I can try to slap something together. How do you want to test the proxies? Do you want to test each proxy by using it to go to a website, or just check if the IP is alive? How do you want the output: in .db format, .csv format, or in separate txt files, i.e. L1 proxies in one file, L2 in another?
    2. Hi,

      I want to test against a URL and put L1, L2, L3 and socks proxies in different TXT files. They have to be ordered by proxy speed. Is that possible?