# comp-4800-web-crawler

This program generates an undirected graph similar to web-Google. Given a starting website, it parses the links on that page, then recursively finds more websites by visiting the parsed links.

**NOTE: Be careful with this program: it sends GET requests to the parsed websites. If you send too many requests to the same website, it may block your IP address.**

# How to run

Make a virtual environment:

```bash
python -m venv venv
```

Activate it:

```bash
source venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Run the program, giving it a starting website:

```bash
python main.py jagrajaulakh.com
```

View the generated graph:

```bash
cat graph.txt
```

# TODO

We can use `pyppeteer` or `playwright` to parse dynamically rendered websites. [Link to article](https://scrapingant.com/blog/scrape-dynamic-website-with-python)
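
# Example: sketch of the crawl

The crawl described at the top of this README (parse a page's links, visit them, record the connections as an undirected graph) could look roughly like the sketch below. This is not the code in `main.py`; the names `LinkParser`, `extract_links`, `crawl`, and the `max_pages` limit are hypothetical and chosen for illustration. It uses only the standard library, walks the links breadth-first with a queue rather than by literal recursion, and takes a `fetch` function as a parameter so the logic can be tried without sending any real requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the set of absolute http(s) links found in html."""
    parser = LinkParser()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).scheme in ("http", "https"):
            links.add(absolute.split("#")[0])  # drop the fragment
    return links


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl from start_url.

    fetch is any callable mapping a URL to its HTML text (an assumption
    made here so the sketch runs offline). At most max_pages pages are
    fetched. Returns a set of undirected edges, each stored as a sorted
    (url_a, url_b) tuple.
    """
    visited = set()
    edges = set()
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for link in extract_links(fetch(url), url):
            if link != url:
                # Sorting makes (a, b) and (b, a) the same edge.
                edges.add(tuple(sorted((url, link))))
            if link not in visited:
                queue.append(link)
    return edges
```

For a real run, `fetch` would wrap an HTTP GET. Keeping `max_pages` small, and pausing between requests, is one way to avoid the IP blocking mentioned in the note above.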