comp-4800-web-crawler

This program generates an undirected graph similar to web-Google. Given a starting website, it parses the links on that page and recursively finds more websites by visiting the parsed links.
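The crawl described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in main.py: the names `LinkParser`, `fetch_html`, and `crawl` are hypothetical, the `max_pages` cap is an added assumption, and the sketch uses a queue instead of literal recursion to avoid deep call stacks. It also assumes the starting URL includes a scheme such as `https://`.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page with a GET request (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=20):
    """Return a set of undirected edges between website hostnames."""
    edges = set()
    visited = set()
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable page or malformed URL: skip it
        parser = LinkParser()
        parser.feed(html)
        source = urlparse(url).netloc
        for href in parser.links:
            target_url = urljoin(url, href)  # resolve relative links
            parts = urlparse(target_url)
            if parts.scheme not in ("http", "https"):
                continue  # ignore mailto:, javascript:, and similar
            if parts.netloc != source:
                # frozenset makes the edge direction-free (undirected)
                edges.add(frozenset((source, parts.netloc)))
            queue.append(target_url)
    return edges
```

Passing `fetch` as a parameter keeps the traversal logic testable without sending real requests.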

NOTE: Be careful with this program. It sends GET requests to every parsed website, and if you send too many requests to the same website, it may block your IP address.
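One common way to reduce the risk of being blocked is to enforce a minimum delay between requests to the same host. The sketch below shows the idea. It is not part of this program as far as the README states: `HostThrottle` and its `min_delay` default of one second are assumptions chosen for illustration.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforces a minimum delay between requests to the same host."""

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # hostname -> time of the last request

    def wait(self, url):
        """Block until it is polite to send another request to url's host."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        if last is not None:
            remaining = self.min_delay - (now - last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_request[host] = now
```

Calling `throttle.wait(url)` immediately before each GET request would space out requests per host while leaving requests to different hosts undelayed.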

How to run

Make a virtual environment:

python -m venv venv

Activate:

source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Run the program, giving a starting website:

python main.py jagrajaulakh.com

View the generated graph:

cat graph.txt
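To work with the graph in Python instead of just printing it, something like the following could load it into an adjacency map. The format of graph.txt is not documented here, so this sketch assumes an edge list in the style of web-Google: one edge per line, two whitespace-separated node names, with lines starting with `#` treated as comments. The function name `load_edge_list` is hypothetical.

```python
from collections import defaultdict


def load_edge_list(lines):
    """Build an undirected adjacency map from edge-list lines.

    Assumed format: "<node_a> <node_b>" per line; '#' starts a comment.
    """
    neighbors = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed format
        a, b = parts
        # Store both directions because the graph is undirected.
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors
```

With a file on disk, `with open("graph.txt") as f: graph = load_edge_list(f)` would give a mapping from each website to the set of websites it shares an edge with. If the real file uses a different layout, the parsing step needs to change accordingly.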