
Crawler Tutorial (Video Transcript)

This SOP provides a step-by-step guide for installing and setting up web crawling software to extract emails from websites. Key steps include downloading the software, installing Python, extracting files, and running the crawler through Command Prompt. Additional tips for efficiency and important notes are also included to ensure a smooth setup process.

Uploaded by

Suryanshu Bansal
Copyright
© All Rights Reserved

Web Crawling Software Setup SOP

Objective

This SOP outlines the steps to install and set up the web crawling software
for extracting emails from specified websites.

Key Steps

1. Download the Web Crawler Zip File 0:15

• Download the zip file to your computer.

2. Install Python 0:44


• Go to Google and search for 'download Python'.
• Click the yellow button to download Python 3.13.
• Run the installer and click 'Next' through the installation prompts.
• Important: Check the box that says 'Add to PATH' during installation (recent installers label it 'Add python.exe to PATH').

3. Extract the Web Crawler Files 2:04

• Navigate to your Downloads folder.
• Right-click on the web crawler zip file and select 'Extract All'.
• Click 'Extract' to create a folder with the extracted files.
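
For anyone who prefers a script to the right-click menu, the same extraction can be sketched with Python's built-in zipfile module. This is only an illustration and not part of the crawler: the file name web_crawler.zip and the helper extract_zip are assumptions, and the demo builds a throwaway zip so it can run anywhere.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir=None):
    """Extract zip_path into dest_dir (default: a folder named after the zip)."""
    zip_path = Path(zip_path)
    dest = Path(dest_dir) if dest_dir else zip_path.with_suffix("")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


# Demo on a throwaway zip so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "web_crawler.zip"  # hypothetical file name
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("main.py", "print('hello')\n")
    folder = extract_zip(demo_zip)
    extracted = sorted(p.name for p in folder.iterdir())
    print(folder.name, extracted)
```

To use it on the real download, call extract_zip with the path of the zip file in your Downloads folder.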

4. Open Command Prompt 3:11


• In the extracted folder, click the address bar and copy the folder path.
• Then click the search box at the bottom of the screen and type 'cmd' to open Command Prompt.

5. Navigate to the Web Crawler Directory 3:23

• In Command Prompt, type 'cd ' followed by the path of the extracted folder (paste it) and press Enter.

6. Install Required Modules 3:44


• In Command Prompt, type 'pip install -r requirements.txt' and press Enter.
• Wait for the installation of modules to complete.

7. Verify Python Installation 4:44

• Type 'python --version' in Command Prompt to check if Python is installed correctly.
• Ensure it shows a valid version number.
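
The same check can also be done from inside Python. The sketch below is a hypothetical helper, not something shipped with the crawler: it reports the running Python version and whether 'pip' can be found on PATH. The minimum version of 3.8 is an assumption; the SOP itself installs Python 3.13.

```python
import shutil
import sys

MIN_VERSION = (3, 8)  # assumed minimum; the SOP itself installs Python 3.13


def check_setup(min_version=MIN_VERSION):
    """Return a dict describing the Python and pip setup on this machine."""
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "version_ok": sys.version_info[:2] >= min_version,
        "pip_on_path": shutil.which("pip") is not None,
    }


report = check_setup()
for name, value in report.items():
    print(f"{name}: {value}")
```

If pip_on_path shows False, the 'Add to PATH' box was most likely left unchecked during installation.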

8. Run the Web Crawler 5:21


• In Command Prompt, type 'python main.py' and press Enter. Then type the websites you want to crawl, separated by commas.
• Press Enter to start the crawling process.
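
How main.py reads the website list is not shown in this SOP. As a rough illustration of what "separated by commas" means in practice, the hypothetical helper below splits a line of input, trims stray spaces, and drops blanks and duplicates. It can be handy for tidying a long list before pasting it into the crawler.

```python
def parse_sites(raw):
    """Split a comma-separated string into a clean list of sites."""
    sites = []
    for part in raw.split(","):
        site = part.strip()
        # Skip empty entries (e.g. from ",,") and repeated sites.
        if site and site not in sites:
            sites.append(site)
    return sites


print(parse_sites("example.com, example.org ,,example.com"))
# prints ['example.com', 'example.org']
```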

9. Access the Results 6:21

• After the crawling is finished, locate the generated Excel file in the same folder as the web crawler.
• Open the Excel file to view the crawled websites and corresponding emails.
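
If the folder holds several files, a short script can point to the latest results. This sketch assumes the crawler saves its output with the .xlsx extension (the SOP only says "Excel file") and should be run from inside the web crawler folder. The helper name newest_excel is made up for illustration.

```python
from pathlib import Path


def newest_excel(folder):
    """Return the most recently modified .xlsx file in folder, or None."""
    files = sorted(Path(folder).glob("*.xlsx"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


result = newest_excel(Path.cwd())
print(result if result else "No .xlsx file found in this folder yet.")
```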

Tips for Efficiency

• Keep your web crawler files organized in a dedicated folder for easy access.
• Regularly update Python and the required modules to avoid compatibility issues.

Link to Loom

https://loom.com/share/16f7f6a58c25422eb1d034bc003b96f7

Important Points to Note:

1. You can return to the web crawler folder at any time to access the Python files.
2. 'pip install -r requirements.txt' is a one-time task. After that, you can run 'python main.py' directly.
3. Always make sure that Command Prompt is pointed at the web crawler folder (type 'cd ' followed by the folder path) before running any command.
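
A quick way to confirm point 3 is to check that the two files named in this SOP, main.py and requirements.txt, exist in the folder where Command Prompt is currently pointed. The sketch below is a hypothetical check, not part of the crawler.

```python
from pathlib import Path

REQUIRED_FILES = ("main.py", "requirements.txt")  # names taken from the steps above


def missing_files(folder):
    """Return the required files that are NOT present in folder."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


missing = missing_files(Path.cwd())
if missing:
    print("Not in the crawler folder; missing:", ", ".join(missing))
else:
    print("OK: this looks like the crawler folder.")
```

If files are reported missing, use 'cd ' with the crawler folder path and try again.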

HAPPY CRAWLING!!
