Search API
Creating a robust API for checking domains against a blacklist requires a few essential considerations:
- Speed: An affiliate network needs to check quickly whether a given domain is on the blacklist, which calls for efficient storage and lookup mechanisms.
- Scalability: Your service should handle large numbers of requests, especially if it's being used by big affiliate networks.
- Security: Protect your API against misuse. Consider rate-limiting, access controls, and other measures to ensure only authorized users can make requests.
Let's start with a simple Flask-based API which relies on in-memory storage for speed:
Ensure you have the required packages:
    pip install Flask Flask-Limiter requests
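    from flask import Flask, request, jsonify
    from flask_limiter import Limiter
    from flask_limiter.util import get_remote_address
    import requests

    app = Flask(__name__)

    # Keyword arguments avoid relying on positional order, which has changed
    # between Flask-Limiter releases.
    limiter = Limiter(
        key_func=get_remote_address,
        app=app,
        default_limits=["200 per day", "50 per hour"],
    )

    BLACKLIST_URL = "https://get.domainsblacklists.com/blacklist.txt"
    cached_blacklist = set()


    def update_blacklist():
        """Download the blacklist and replace the in-memory copy."""
        global cached_blacklist
        response = requests.get(BLACKLIST_URL, timeout=10)
        if response.status_code == 200:
            # Assumes one domain per line; skip blanks and normalize case.
            cached_blacklist = {
                line.strip().lower()
                for line in response.text.splitlines()
                if line.strip()
            }


    @app.route('/check-domain', methods=['POST'])
    @limiter.limit("10 per minute")
    def check_domain():
        # silent=True returns None for a missing or malformed JSON body.
        data = request.get_json(silent=True)
        domain = data.get('domain') if isinstance(data, dict) else None
        if not isinstance(domain, str) or not domain.strip():
            return jsonify(error='Domain not provided'), 400
        if domain.strip().lower() in cached_blacklist:
            return jsonify(status="blacklisted")
        return jsonify(status="safe")


    if __name__ == '__main__':
        update_blacklist()  # Update the blacklist when the app starts
        app.run(debug=True)  # Debug mode is for local development only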
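The script above fetches the list only once, at startup. As a rough sketch of a periodic refresh, the class below uses a standard-library daemon thread, which is a lighter-weight alternative to a dedicated task scheduler and not necessarily how the author would do it. The names `BlacklistCache`, `fetch_lines`, and `interval_seconds` are hypothetical; `fetch_lines` stands in for whatever function downloads the list and returns its lines.

```python
import threading
from typing import Callable, Iterable


class BlacklistCache:
    """Holds the current blacklist and refreshes it on a timer (hypothetical helper)."""

    def __init__(self, fetch_lines: Callable[[], Iterable[str]],
                 interval_seconds: float = 3600.0):
        self._fetch_lines = fetch_lines  # assumed: returns an iterable of domain strings
        self._interval = interval_seconds
        self._domains: frozenset = frozenset()
        self._stop = threading.Event()

    def refresh(self) -> None:
        """Fetch a fresh list; keep the previous one if the fetch fails."""
        try:
            lines = self._fetch_lines()
        except Exception:
            return  # keep serving the last good copy
        domains = frozenset(line.strip().lower() for line in lines if line.strip())
        if domains:
            # An empty result is treated as a failed download (an assumption).
            # Rebinding the attribute swaps the whole set in one step,
            # so readers never see a partially built list.
            self._domains = domains

    def __contains__(self, domain: str) -> bool:
        return domain.strip().lower() in self._domains

    def start(self) -> None:
        """Load once immediately, then refresh in a daemon thread."""
        self.refresh()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            self.refresh()

    def stop(self) -> None:
        self._stop.set()
```

In the Flask app, `cache.start()` would take the place of the single `update_blacklist()` call, and the route would test `domain in cache`. With several worker processes, each process runs its own thread and keeps its own copy of the list.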
- In-Memory Storage: The script above caches the blacklist in memory. This is very fast, but it may not scale to massive blacklists, and each server process holds its own copy. Consider a shared data store (e.g., Redis) for larger use cases.
- Rate Limiting: Flask-Limiter limits each client IP to 10 requests per minute on this endpoint to prevent abuse. Adjust the limits to your requirements and capacity.
- Auto-Updating the Blacklist: The script loads the blacklist once, when the app starts. You will probably want to refresh it periodically, for example with a background task scheduler such as Celery.
- Authentication: The API as written is open, meaning anyone can call it. In production, add some form of authentication, such as API keys, so that only authorized users can access the service.
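To make the API key idea concrete, here is a minimal, framework-agnostic sketch. The header name `X-API-Key`, the `API_KEY_DIGESTS` store, the sample key, and the `authenticate` function are all assumptions for illustration; a real service would load issued keys from a database or secret manager.

```python
import hashlib
import hmac
from typing import Mapping, Optional

# Hypothetical key store: SHA-256 digests of issued keys mapped to a client name.
# Storing digests means a leaked table does not reveal the keys themselves.
API_KEY_DIGESTS = {
    hashlib.sha256(b"demo-key-123").hexdigest(): "demo-client",
}


def authenticate(headers: Mapping[str, str]) -> Optional[str]:
    """Return the client name for a valid X-API-Key header, else None."""
    presented = headers.get("X-API-Key", "")
    if not presented:
        return None
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    for stored_digest, client in API_KEY_DIGESTS.items():
        # compare_digest takes time independent of where the strings differ,
        # which avoids leaking information through response timing.
        if hmac.compare_digest(digest, stored_digest):
            return client
    return None
```

In Flask, this could be called as `authenticate(request.headers)` at the top of `check_domain`, returning a 401 response when the result is `None`. Flask's header object is case-insensitive; a plain `dict`, as used in a quick test, is not.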
Deploying this API on several cloud servers behind a load balancer improves availability and scalability. If you expect a very high request rate, consider running it in containers orchestrated by a platform such as Kubernetes.
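One consequence of sitting behind a load balancer is worth sketching: the app typically sees the balancer's address as the remote address, so per-IP rate limits would lump all clients together. The function below shows one common way to recover the client address from the `X-Forwarded-For` header. `client_ip` and `trusted_proxies` are hypothetical names, and the sketch assumes every trusted proxy appends the address it received the request from.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], remote_addr: str,
              trusted_proxies: int = 1) -> str:
    """Pick the client address from X-Forwarded-For, trusting only the
    last `trusted_proxies` entries (the ones your own infrastructure added)."""
    forwarded = headers.get("X-Forwarded-For", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]
    if trusted_proxies <= 0 or len(hops) < trusted_proxies:
        # Header missing or shorter than expected: fall back to the socket peer.
        return remote_addr
    # Entries further left were supplied by the client and can be spoofed,
    # so count from the right-hand end of the list.
    return hops[-trusted_proxies]
```

Werkzeug ships a `ProxyFix` middleware that applies this kind of rule to a Flask app, so wrapping `app.wsgi_app` with it is likely simpler than hand-rolling the logic. Note too that rate-limit counters kept in process memory are not shared between instances, so a shared storage backend may be needed for limits to hold across servers.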