Scrape-O-matic: January 2015

Tuesday 20 January 2015

Image Scraper

How to:

To start, add one or more URLs to the top box, or load a list of URLs from a file. Then click Start and wait for the scraped image links to appear in the bottom box. Note that there may be a short lag between links being scraped and the bottom box updating.
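For anyone curious what "scraping image links" boils down to, here is a minimal sketch in Python 3 using only the standard library. It is not the code inside Image Scraper (which is a .NET program); the names `ImageLinkParser` and `scrape_image_links` are made up for this example, and it only looks at `<img src="...">` tags in HTML you have already downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect absolute image URLs from the <img src="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths such as "/a.png" against the page URL.
            self.image_links.append(urljoin(self.base_url, src))


def scrape_image_links(page_url, html):
    """Return the image links found in html, in page order."""
    parser = ImageLinkParser(page_url)
    parser.feed(html)
    return parser.image_links
```

Calling `scrape_image_links(url, html)` once per URL in your list and joining the results gives roughly what the bottom box shows.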

Flaws:

The bottom box that shows all the scraped image links has a small flaw: you cannot copy the links directly from it, because Ctrl + C won't work. You will have to save them to a file and go from there. The program also has no option to download and save images, apart from the preview option. It only scrapes image links rather than mass downloading images because there are plenty of download managers that are built to be reliable. They will handle mass downloads a lot faster than anything I could slap together as an added feature. I recommend Orbit, which is what I usually use, but I am sure there are plenty of other good download managers.
You can import your list of image links into Orbit by clicking File - Import list of downloads. Orbit also lets you set how many files to download at once and the download speed.

Preview box:

The preview box shows a preview of the image link you click on. You can save the image in the preview box by clicking on it. So far it can only save bmp, gif, jpeg, png, tiff and wmf, but the scraper still scrapes all image formats.

Bugs:

None that I know of. Please post any that you find here and I will try to fix them.

Requirements:

.NET Framework 3.5

Download:

- ImageScraper-1.0

If there is anything you would like me to add or change, feel free to ask and I will consider doing so.

Monday 19 January 2015

Indexer and Scraper

Indexer

This indexer indexes a page and all of its subpages. It can scrape links, image links, video links, iframe links and embed links.

mywebsite.com > forum > sign up, profile, login, logout
              > products > downloads
              > contact us
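The walk over a page and its subpages can be sketched like this in Python 3. This is only an illustration of the idea, not the Indexer's own code: `index_site` and `LinkParser` are made-up names, `fetch` is a function you supply that returns a page's HTML (or `None` if it cannot be loaded), and staying on the same host is an assumption I made to keep the example small.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def index_site(start_url, fetch):
    """Breadth-first walk from start_url; returns pages in discovery order."""
    host = urlparse(start_url).netloc
    seen = [start_url]
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be loaded, nothing to follow
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = urljoin(url, href)
            # Assumption: only follow links on the same host as the start page.
            if urlparse(link).netloc == host and link not in seen:
                seen.append(link)
                queue.append(link)
    return seen
```

Passing `fetch` in as a parameter means you can try the walk on a dictionary of saved pages (`pages.get`) before pointing it at a real site.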

Match Text

Only URLs that contain the match text are scraped. If you only want to scrape http://mywebsite.com, use "mywebsite.com" as the match text.
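In Python terms, the match text works roughly like a plain substring filter. This is a sketch of the idea only; `filter_by_match_text` is a made-up name, and treating the match as case-sensitive is an assumption.

```python
def filter_by_match_text(urls, match_text):
    """Keep only the URLs that contain match_text somewhere in them."""
    # Assumption: a simple, case-sensitive substring check.
    return [url for url in urls if match_text in url]
```

Keep in mind that a substring check like this also keeps URLs that merely mention the text, for example http://other.com/?ref=mywebsite.com, so pick a match text that is specific enough.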

Scraper

This scraper takes a list of URLs and returns the links and image links found on them.

If you find any problems, please post them here and I will try to fix them. Feel free to make suggestions and ask me to add functionality.

Requirements:

.NET Framework 4.5

Download

- Indexer & Scraper-1.0

Monday 12 January 2015

Simple Proxy Scraper Python Script

Update:
New! (1/9/2015) A new/updated Proxy scraper Python script is available here.
(12/1/2015) A new Proxy scraper Python script is available here.

Hi guys,

I made a Python proxy scraping script for Linux VPSes. The script should run fine on any Linux system, since most Linux systems come with Python as standard. If you want to run the script on Windows, download and install Python 2.7, not Python 3.

The script grabs about 120 proxies and saves them to a file named "proxies<date+time>.txt"
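If you want to do something similar in your own script, a timestamped output file can be built like this. It is a sketch, not a copy of the script: the function names are made up, and the exact date and time layout in the file name is an assumption, since the real script may format it differently.

```python
from datetime import datetime


def proxy_filename(now=None):
    """Build a name like proxies2015-01-12_09-30-00.txt (layout is an assumption)."""
    now = now or datetime.now()
    return "proxies" + now.strftime("%Y-%m-%d_%H-%M-%S") + ".txt"


def save_proxies(proxies, path):
    """Write one proxy per line to path."""
    with open(path, "w") as handle:
        handle.write("\n".join(proxies) + "\n")
```

Using a fresh name on every run means an earlier list is never overwritten.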

Proxies are grabbed from:
http://proxy-list.org
http://www.samair.ru
http://spys.ru/
http://nntime.com/
http://proxy-ip-list.com/

The script was made to be dynamic and can scrape any proxy site where the proxies are listed as "xxx.xxx.xxx.xxx:xxxxx", without anything between the IP and the port in the page source. You should be able to add other proxy sites to the script if their proxies are listed that way: add them to the "urls" variable under #PROGRAM.
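To show what matching "xxx.xxx.xxx.xxx:xxxxx" in a page source can look like, here is a small regular expression sketch. It is not lifted from the script; `PROXY_PATTERN` and `extract_proxies` are names chosen for this example, and the pattern does not check that each number in the IP is 255 or lower.

```python
import re

# Four dot-separated groups of 1-3 digits, a colon, then a port of 1-5 digits.
PROXY_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")


def extract_proxies(page_source):
    """Return the unique ip:port strings in page_source, in order of appearance."""
    found = []
    for match in PROXY_PATTERN.findall(page_source):
        if match not in found:
            found.append(match)
    return found
```

This also shows why a site only works if nothing sits between the IP and the port: when they are in separate table cells, the pattern finds no match.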

Feel free to ask for help.
Is there any demand for Python, Linux programs or bots? I prefer making stuff for Linux over Windows.

Requirements
- Python 2.7

Download
- ScraperScript-1.0