How to Use Free Proxies With Selenium or Puppeteer

Understanding Proxies in Web Automation

Proxies serve as the clandestine agents of the internet, masking your IP address and enabling you to traverse digital frontiers with subtlety. In the context of web automation—where Selenium and Puppeteer pirouette in the browser’s theatre—proxies are indispensable for circumventing rate limits, geo-restrictions, and surveillance. Free proxies, though capricious and ephemeral, can suffice for lightweight, non-critical scraping or testing scenarios.

Types of Proxies and Their Characteristics

Proxy Type      | Anonymity Level   | Protocols Supported | Typical Use-Case        | Reliability
HTTP            | Low to Medium     | HTTP, HTTPS         | Simple web scraping     | Low
SOCKS4/5        | High              | SOCKS4, SOCKS5      | Complex protocols, HTTPS | Medium
Transparent     | None (reveals IP) | HTTP, HTTPS         | Caching, internal use   | Very Low
Elite/Anonymous | High              | HTTP, HTTPS         | Bypassing geo-blocks    | Medium

For a compendium of free proxy lists, peruse https://free-proxy-list.net/ or https://www.sslproxies.org/.

Using Free Proxies with Selenium (Python)

1. Installing Dependencies

pip install selenium

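Selenium 4.6 and later include Selenium Manager, which fetches a ChromeDriver matching your installed Chrome automatically. On older Selenium versions, download the ChromeDriver release that matches your Chrome version and place it on your PATH.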

2. Configuring a Proxy in Selenium

The browser, that digital marionette, can be commanded thus:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

proxy = "186.121.235.66:8080"  # Replace with your free proxy

options = Options()
options.add_argument(f'--proxy-server=http://{proxy}')

driver = webdriver.Chrome(options=options)
driver.get('https://httpbin.org/ip')

print(driver.page_source)
driver.quit()
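
Chrome typically wraps the JSON returned by httpbin.org/ip in a small HTML page, so driver.page_source is not directly parseable. The sketch below is one way to pull the reported address out of that markup so it can be compared with the proxy's IP. The helper name extract_origin is hypothetical, and the code assumes the page contains a single JSON object with an "origin" key.

```python
import json
import re
from typing import Optional


def extract_origin(page_source: str) -> Optional[str]:
    """Return the "origin" value from an httpbin.org/ip page, or None.

    Assumes the first {...} span in the page is the JSON payload; an error
    page served by a broken proxy will usually not match and yields None.
    """
    match = re.search(r"\{.*?\}", page_source, re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return payload.get("origin")
```

If the returned address matches the proxy rather than your own IP, the browser traffic is going through the proxy. A None result usually means the proxy returned an error page instead of the expected JSON.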

Table: Common Chrome Proxy Switches

Option                                  | Description
--proxy-server=http://IP:PORT           | Route all traffic through an HTTP proxy
--proxy-server=https=IP:PORT            | Use the proxy only for https:// URLs
--proxy-bypass-list=localhost;127.0.0.1 | Exclude these hosts from proxying (semicolon-separated)
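
To avoid hand-assembling these switches, a small helper can build them from a host, port, and optional bypass list. This is a minimal sketch: the function name chrome_proxy_args is hypothetical, and the set of accepted schemes is limited to the ones this article discusses.

```python
from typing import Iterable, List, Optional

# Proxy schemes this sketch accepts for Chrome's --proxy-server switch.
SUPPORTED_SCHEMES = {"http", "https", "socks4", "socks5"}


def chrome_proxy_args(host: str, port: int, scheme: str = "http",
                      bypass: Optional[Iterable[str]] = None) -> List[str]:
    """Build the Chrome command-line switches for a single proxy."""
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {scheme!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")

    args = [f"--proxy-server={scheme}://{host}:{port}"]
    bypass_list = list(bypass or [])
    if bypass_list:
        # Chrome expects the bypass entries separated by semicolons.
        args.append("--proxy-bypass-list=" + ";".join(bypass_list))
    return args
```

Each returned string can be passed to options.add_argument() in Selenium, or placed in the args array given to puppeteer.launch().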

3. Using Proxies with Authentication

Free proxies with authentication are rare gems, but should you chance upon one:

import zipfile

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Chrome's --proxy-server switch does not accept embedded credentials, so
# authenticated proxies require a Chrome extension workaround.
# Note: this extension uses Manifest V2, which Chrome is phasing out,
# so it may fail to load on recent Chrome builds.

proxy_host = 'proxy.example.com'
proxy_port = 8000
proxy_user = 'user'
proxy_pass = 'pass'

manifest_json = """
{
  "version": "1.0.0",
  "manifest_version": 2,
  "name": "Chrome Proxy",
  "permissions": [
    "proxy",
    "tabs",
    "unlimitedStorage",
    "storage",
    "<all_urls>",
    "webRequest",
    "webRequestBlocking"
  ],
  "background": {
    "scripts": ["background.js"]
  }
}
"""

background_js = f"""
var config = {{
        mode: "fixed_servers",
        rules: {{
          singleProxy: {{
            scheme: "http",
            host: "{proxy_host}",
            port: parseInt({proxy_port})
          }},
          bypassList: ["localhost"]
        }}
      }};
chrome.proxy.settings.set({{value: config, scope: "regular"}}, function() {{}});
function callbackFn(details) {{
    return {{
        authCredentials: {{
            username: "{proxy_user}",
            password: "{proxy_pass}"
        }}
    }};
}}
chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {{urls: ["<all_urls>"]}},
    ['blocking']
);
"""

# Create the proxy extension
pluginfile = 'proxy_auth_plugin.zip'
with zipfile.ZipFile(pluginfile, 'w') as zp:
    zp.writestr("manifest.json", manifest_json)
    zp.writestr("background.js", background_js)

chrome_options = Options()
chrome_options.add_extension(pluginfile)

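driver = webdriver.Chrome(options=chrome_options)
try:
    driver.get('https://httpbin.org/ip')
    print(driver.page_source)
finally:
    # Always release the browser, even if the proxy fails
    driver.quit()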

Reference: Selenium Proxy Authentication (GitHub Gist)
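
Authenticated proxies are often handed out as a single URL of the form scheme://user:pass@host:port, an assumption about the provider's format rather than something every list guarantees. The sketch below splits such a URL into the separate values used above (proxy_host, proxy_port, proxy_user, proxy_pass). The names ProxySettings and parse_proxy_url are hypothetical.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlsplit


class ProxySettings(NamedTuple):
    scheme: str
    host: str
    port: int
    username: Optional[str]
    password: Optional[str]


def parse_proxy_url(url: str) -> ProxySettings:
    """Split scheme://[user:pass@]host:port into its components."""
    parts = urlsplit(url)
    if not parts.scheme or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://[user:pass@]host:port, got {url!r}")
    # Credentials may be percent-encoded (e.g. "@" written as "%40").
    username = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return ProxySettings(parts.scheme, parts.hostname, parts.port, username, password)
```

For example, parse_proxy_url("http://user:pass@proxy.example.com:8000") yields the host, port, and credentials as separate fields, ready to be substituted into the extension template.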

Using Free Proxies with Puppeteer (Node.js)

1. Installing Puppeteer

npm install puppeteer

2. Launching Puppeteer with a Proxy

Let the browser don its new mask:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://186.121.235.66:8080']
  });
  const page = await browser.newPage();
  await page.goto('https://httpbin.org/ip');
  const body = await page.content();
  console.log(body);
  await browser.close();
})();

3. Handling Proxy Authentication

When the gatekeeper demands credentials:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://proxy.example.com:8000']
  });
  const page = await browser.newPage();

  await page.authenticate({
    username: 'user',
    password: 'pass'
  });

  await page.goto('https://httpbin.org/ip');
  const body = await page.content();
  console.log(body);
  await browser.close();
})();

4. Rotating Proxies in Puppeteer

A ballet of ephemeral identities, orchestrated thus:

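const puppeteer = require('puppeteer');

// Placeholders: replace each entry with a real http://host:port value
const proxies = [
  'http://proxy1:port',
  'http://proxy2:port',
  'http://proxy3:port'
];

(async () => {
  for (const proxy of proxies) {
    const browser = await puppeteer.launch({
      args: [`--proxy-server=${proxy}`]
    });
    try {
      const page = await browser.newPage();
      await page.goto('https://httpbin.org/ip', { timeout: 15000 });
      const body = await page.content();
      console.log(`Proxy: ${proxy}\n${body}\n`);
    } catch (err) {
      // Dead or slow free proxies are common; log and try the next one
      console.error(`Proxy failed: ${proxy} (${err.message})`);
    } finally {
      await browser.close();
    }
  }
})();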
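
The same rotation idea carries over to the Selenium examples. The sketch below is a minimal round-robin pool in Python that hands out proxies in turn and lets the caller drop one that has stopped responding. The class name ProxyPool and its methods are hypothetical, not part of Selenium or Puppeteer.

```python
from collections import deque
from typing import Deque, Iterable


class ProxyPool:
    """Round-robin pool of proxy addresses with removal of dead entries."""

    def __init__(self, proxies: Iterable[str]) -> None:
        # dict.fromkeys removes duplicates while preserving order.
        self._proxies: Deque[str] = deque(dict.fromkeys(proxies))

    def __len__(self) -> int:
        return len(self._proxies)

    def next(self) -> str:
        """Return the next proxy and move it to the back of the queue."""
        if not self._proxies:
            raise LookupError("no live proxies left")
        proxy = self._proxies[0]
        self._proxies.rotate(-1)
        return proxy

    def discard(self, proxy: str) -> None:
        """Remove a proxy that failed; unknown entries are ignored."""
        try:
            self._proxies.remove(proxy)
        except ValueError:
            pass
```

A typical loop would call pool.next(), launch the browser with that proxy, and call pool.discard(proxy) whenever the page load times out.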

Free Proxy Sources

Name            | URL                                     | Features
Free Proxy List | https://free-proxy-list.net/            | Large HTTP/S list, updated
SSL Proxies     | https://www.sslproxies.org/             | HTTPS proxies, fast update
ProxyScrape     | https://proxyscrape.com/free-proxy-list | Multiple protocols
Spys.one        | http://spys.one/en/                     | Advanced filtering
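
Lists copied or exported from sources like these often arrive as plain text with one IP:PORT entry per line, mixed with blank lines and stray text. That format is an assumption here, since each site presents its data differently. The sketch below keeps only well-formed IPv4 entries and drops duplicates; the function name parse_proxy_lines is hypothetical.

```python
import ipaddress
import re
from typing import List

# One "a.b.c.d:port" entry per line; anything else is ignored.
_ENTRY = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})", re.ASCII)


def parse_proxy_lines(text: str) -> List[str]:
    """Extract unique, well-formed IPv4 "host:port" entries from raw text."""
    proxies: List[str] = []
    seen = set()
    for raw in text.splitlines():
        match = _ENTRY.fullmatch(raw.strip())
        if match is None:
            continue
        host, port = match.group(1), int(match.group(2))
        if not 0 < port < 65536:
            continue
        try:
            ipaddress.IPv4Address(host)  # rejects octets above 255
        except ValueError:
            continue
        entry = f"{host}:{port}"
        if entry not in seen:
            seen.add(entry)
            proxies.append(entry)
    return proxies
```

The result is a clean list of strings in the same IP:PORT form used by the proxy variable throughout this article.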

Best Practices and Limitations

  • Ephemeral Nature: Free proxies often vanish without notice; monitor their liveness using tools like ProxyChecker.
  • Speed & Reliability: Expect latency, timeouts, and the occasional dead end.
  • Security: Never use free proxies for sensitive accounts—man-in-the-middle attacks lurk in the shadows.
  • Legal & Ethical Considerations: Always respect robots.txt and terms of service.

Proxy Validation Example (Python)

Before invoking your browser, test the proxy’s pulse:

import requests

proxy = "186.121.235.66:8080"
proxies = {"http": f"http://{proxy}", "https": f"http://{proxy}"}

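try:
    response = requests.get('https://httpbin.org/ip', proxies=proxies, timeout=5)
    response.raise_for_status()
    print(response.json())
except (requests.RequestException, ValueError) as e:
    # Covers connection errors, timeouts, HTTP errors, and non-JSON replies
    print(f"Proxy failed: {e}")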
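
Checking a long list one proxy at a time is slow, because most free proxies simply time out. The sketch below runs the checks concurrently in a thread pool. It swaps requests for the standard library's urllib so that it has no third-party dependency, and the names check_proxy and filter_live are hypothetical. The checker is passed in as a parameter so it can be replaced with a stricter test.

```python
import http.client
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def check_proxy(proxy: str, url: str = "https://httpbin.org/ip",
                timeout: float = 5.0) -> bool:
    """Return True if the URL can be fetched through the given host:port."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


def filter_live(proxies: Iterable[str],
                checker: Callable[[str], bool] = check_proxy,
                max_workers: int = 10) -> List[str]:
    """Return the proxies for which checker() is true, in input order."""
    candidates = list(proxies)
    if not candidates:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(checker, candidates))
    return [proxy for proxy, ok in zip(candidates, results) if ok]
```

Calling filter_live(candidates) immediately before launching the browser keeps the gap between validation and use short, which matters because free proxies can disappear within minutes.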

Théophile Beauvais

Proxy Analyst

Théophile Beauvais is a 21-year-old Proxy Analyst at ProxyMist, where he specializes in curating and updating comprehensive lists of proxy servers from across the globe. With an innate aptitude for technology and cybersecurity, Théophile has become a pivotal member of the team, ensuring the delivery of reliable SOCKS, HTTP, elite, and anonymous proxy servers for free to users worldwide. Born and raised in the picturesque city of Lyon, Théophile's passion for digital privacy and innovation was sparked at a young age.
