Controlling a web browser from a program is useful in many scenarios, such as website test automation and web scraping. A very popular framework for this kind of automation is Selenium WebDriver.
Selenium WebDriver is a browser-controlling library. It supports all major browsers (Firefox, Edge, Chrome, Safari, Opera, etc.) and is available for several programming languages, including Python. In this tutorial, we will use its Python bindings to automate logging in to websites.
Automating the login process of a website comes in handy. For example, you may want to edit your account settings automatically, or extract some information that is only available after logging in.
First, let's install Selenium for Python:
pip3 install selenium
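Since the Selenium API changed noticeably between major versions, it can help to confirm which version was installed. Here is a minimal sketch using only the standard library; installed_version is a helper name introduced here for illustration, not part of Selenium:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("selenium")
    print(f"selenium {found}" if found else "selenium is not installed")
```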
To make things concrete, I'll be using the GitHub login page to demonstrate how you can log in automatically with Selenium.
Open up a new Python script and initialize the WebDriver:
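from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait

# GitHub credentials
username = "username"
password = "password"

# initialize the Chrome driver (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service("chromedriver"))

After you have downloaded and unzipped ChromeDriver for your OS, put it in your current directory or in a known path, so you can pass its location to the Service object that is handed to the webdriver.Chrome() class. In my case, chromedriver.exe is in the current directory, so I simply pass its name. Note that Selenium 3 accepted the path directly, as in webdriver.Chrome("chromedriver"), but recent Selenium 4 releases no longer do.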
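Hardcoding credentials in the script is fine for a demo, but you may prefer to keep them out of the source file. Below is a small sketch that reads them from environment variables instead. The names load_credentials, GITHUB_USERNAME, and GITHUB_PASSWORD are assumptions made for this example:

```python
import os


def load_credentials(env=None):
    """Read the login credentials from environment variables.

    GITHUB_USERNAME and GITHUB_PASSWORD are hypothetical variable names
    chosen for this sketch; use whatever names suit your setup.
    """
    env = os.environ if env is None else env
    try:
        return env["GITHUB_USERNAME"], env["GITHUB_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None


# Example with an explicit mapping instead of the real environment
username, password = load_credentials(
    {"GITHUB_USERNAME": "username", "GITHUB_PASSWORD": "password"}
)
```

Calling load_credentials() with no argument reads from the real environment, so the script fails early with a clear message if a variable is not set.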
Since we're interested in automating the GitHub login, we'll navigate to the GitHub login page and inspect it to identify its HTML elements.
The id of the login and password input fields and the name of the Sign in button let us retrieve these elements in code and fill them in programmatically.
Notice that the username/email address input field has the id login_field, the password input field has the id password, and the submit button has the name commit. The code below goes to the GitHub login page, finds these elements, fills in the credentials, and clicks the button:
from selenium.webdriver.common.by import By

# head to the GitHub login page
driver.get("https://github.com/login")
# find the username/email field and type the username into it
driver.find_element(By.ID, "login_field").send_keys(username)
# find the password input field and type the password as well
driver.find_element(By.ID, "password").send_keys(password)
# click the login button
driver.find_element(By.NAME, "commit").click()

The find_element() method, called with By.ID, retrieves an HTML element by its id, and the send_keys() method simulates keypresses. The above code makes Chrome type in the username and the password, and then click the Sign in button. Older tutorials use find_element_by_id() and find_element_by_name(); those helpers were removed in Selenium 4.3, so the By locators are used here instead.
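If you plan to reuse these steps for other sites, one option is to wrap them in a function that takes the locators as parameters. This is only a sketch: submit_login and its parameter names are made up for this example, and the default locators are the GitHub ones identified above. It relies on Selenium 4's find_element(strategy, value) call, where the plain strings "id" and "name" are the values behind the By.ID and By.NAME constants.

```python
def submit_login(driver, url, username, password,
                 user_locator=("id", "login_field"),
                 password_locator=("id", "password"),
                 submit_locator=("name", "commit")):
    """Open `url`, fill in both fields, and click the submit button.

    Each locator is a (strategy, value) pair; the strings "id" and "name"
    are the values behind selenium's By.ID and By.NAME constants.
    """
    driver.get(url)
    driver.find_element(*user_locator).send_keys(username)
    driver.find_element(*password_locator).send_keys(password)
    driver.find_element(*submit_locator).click()
```

With a real driver, the call would look like submit_login(driver, "https://github.com/login", username, password). For another website, you would pass that site's own locators.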
The next thing to do is to determine whether our login was successful. There are many ways to detect that, but in this tutorial, we'll do it by detecting the errors shown upon login (of course, this differs from one website to another).
The above image shows what happens when we insert wrong credentials: you'll see a new HTML div element with the class flash-error containing the text "Incorrect username or password.".
The code below uses WebDriverWait() to wait for the page to load after the login is submitted, and then checks for the error:
from selenium.webdriver.common.by import By

# wait for the ready state to be complete
WebDriverWait(driver=driver, timeout=10).until(
    lambda x: x.execute_script("return document.readyState === 'complete'")
)
error_message = "Incorrect username or password."
# get the errors (if there are any)
errors = driver.find_elements(By.CLASS_NAME, "flash-error")
# print the errors optionally
# for e in errors:
#     print(e.text)
# if we find that error message within errors, then the login has failed
if any(error_message in e.text for e in errors):
    print("[!] Login failed")
else:
    print("[+] Login successful")
We use WebDriverWait to wait until the document has finished loading. The script return document.readyState === 'complete' returns True when the page is loaded, and False otherwise.
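To get a feel for what until() does, here is a simplified polling loop in plain Python. It is a sketch of the idea, not Selenium's actual implementation: it calls a condition repeatedly until the result is truthy or the timeout expires. The name wait_until is invented for this example. The 0.5 second default mirrors WebDriverWait's default polling interval, although WebDriverWait raises Selenium's own TimeoutException rather than the built-in TimeoutError used here.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Example: a condition that becomes true on the third call
calls = []


def ready():
    calls.append(1)
    return len(calls) >= 3


wait_until(ready, timeout=2, poll=0.01)
```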
Finally, we close our driver:
# quit the driver, which closes the browser and ends the session
driver.quit()
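If something raises an exception halfway through the script, the browser window is left open. One way to guard against that is a try/finally wrapper, sketched below. The names run_session, make_driver, and task are invented for this example. It calls quit(), which ends the whole browser session, whereas close() only closes the current window.

```python
def run_session(make_driver, task):
    """Create a driver, run `task(driver)`, and always quit the driver afterwards."""
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # quit() ends the whole browser session, even if `task` raised
        driver.quit()
```

Here make_driver is any callable that returns a driver, for example a small function that wraps the webdriver.Chrome(...) call, and task holds the login steps.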
Alright, now you have the skill to log in automatically to the website of your choice. Note that GitHub will block you if you run the script multiple times with wrong credentials, so be aware of that.
Now you can do whatever you want after logging in to your account; add your code at the line where we print 'Login successful'.
Also, if you've successfully logged in using your real account, you may encounter an email confirmation step. To get past it, you have to read your email programmatically with Python, extract the confirmation code, and insert it in real time using Selenium. Great challenge, isn't it? Good luck with it!
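As a starting point for that challenge, the extraction step could look like the sketch below. It assumes you already have the email body as a string and that the code is a standalone run of six digits. Both are assumptions for illustration, since the real message format depends on the website. The extract_code helper and the sample text are made up for this example.

```python
import re


def extract_code(body, digits=6):
    """Return the first standalone run of `digits` digits in `body`, or None."""
    match = re.search(rf"(?<!\d)\d{{{digits}}}(?!\d)", body)
    return match.group(0) if match else None


sample = "Your verification code is 482913. It expires in 15 minutes."
print(extract_code(sample))  # 482913
```

The returned string can then be typed into the confirmation field with send_keys(), just like the username and password.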
Note that the login process differs from one website to another; the goal of this tutorial is to give you the essential skills to automate the login of your target website.
Learn also: How to Use Github API in Python.
Happy Automating ♥