Update README with setup instructions and added requirements.txt

2023-03-12 14:30:55 -04:00
parent 4040c52114
commit 4597d8c775
3 changed files with 44 additions and 0 deletions

@@ -1,2 +1,40 @@
# comp-4800-web-crawler
This program generates an undirected graph similar to web-Google. Given a starting
website, it parses the links on that page and recursively finds more
websites by visiting the parsed links.
**NOTE: Be careful with this program: it sends GET requests to the parsed websites. If you
send too many requests to the same website, it may block your IP address.**
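
The crawl described above can be sketched roughly as follows. This is not the code in `main.py`; it is a minimal illustration that uses an explicit queue instead of recursion and only the Python standard library. The names `LinkParser`, `extract_links`, `crawl`, `fetch`, `max_pages`, and `delay` are all assumptions made for the example. `fetch` stands in for whatever function performs the GET request, and `delay` adds a pause between requests so the same site is not hit too quickly.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return the absolute http(s) links found in an HTML string."""
    parser = LinkParser()
    parser.feed(html)
    absolute = (urljoin(base_url, href) for href in parser.links)
    return [url for url in absolute if urlparse(url).scheme in ("http", "https")]


def crawl(start_url, fetch, max_pages=50, delay=1.0):
    """Breadth-first crawl that returns a set of undirected edges.

    fetch is a hypothetical callable mapping a URL to its HTML text;
    in practice it would wrap an HTTP GET request.
    """
    edges = set()
    visited = set()
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for link in extract_links(url, fetch(url)):
            if link != url:
                # A frozenset treats (a, b) and (b, a) as the same edge.
                edges.add(frozenset((url, link)))
            if link not in visited:
                queue.append(link)
        time.sleep(delay)  # pause between requests to stay polite
    return edges
```

Passing `fetch` as a parameter keeps the sketch testable without network access: a dictionary lookup can stand in for real requests. The `max_pages` cap is one simple way to keep the crawl from running indefinitely.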
# How to run
Make a virtual environment:
```bash
python -m venv venv
```
Activate:
```bash
source venv/bin/activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Run the program, passing a starting website:
```bash
python main.py jagrajaulakh.com
```
View the generated graph:
```bash
cat graph.txt
```