# Web
# Url
# Encoding
`+` means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:
http://www.example.com/path/foo+bar/path?query+name=query+value
In this URL, the parameter name is `query name` with a space and the value is `query value` with a space, but the folder name in the path is literally `foo+bar`, not `foo bar`.
`%20` is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with `%20` and pluses with `%2B`.
from urllib.parse import quote, quote_plus
quote(' ')  # '%20', similar to encodeURIComponent in JS
quote_plus(' ')  # '+'; prefer '%20', which is valid in both the path and the query
from requests.utils import requote_uri
requote_uri("http://www.sample+d.com/?id=123 abc")  # 'http://www.sample+d.com/?id=123%20abc', similar to encodeURI in JS
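As a rough illustration of the rule above, the following standard-library sketch splits the example URL and decodes the path and the query separately. The variable names are only illustrative; the point is that `+` is decoded as a space by the form decoder used for the query, but not by the plain percent-decoder used for the path.

```python
from urllib.parse import parse_qs, quote, unquote, unquote_plus, urlsplit

url = "http://www.example.com/path/foo+bar/path?query+name=query+value"
parts = urlsplit(url)

# In the path, '+' is a literal plus sign: unquote() leaves it untouched.
decoded_path = unquote(parts.path)

# In the query, parse_qs() applies form decoding, so '+' becomes a space.
decoded_query = parse_qs(parts.query)

# Encoding spaces as %20 and pluses as %2B round-trips through both decoders.
safe = quote("foo bar+baz", safe="")

print(decoded_path)
print(decoded_query)
print(safe, unquote(safe), unquote_plus(safe))
```

Because the `%20`/`%2B` form decodes to the same string under both `unquote` and `unquote_plus`, it can be used without knowing which part of the URL the value ends up in.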
# Selenium
XPath usage
firefox:
function getElementsByXPath(xpath, parent) {
  let results = [];
  let query = document.evaluate(xpath, parent || document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  for (let i = 0, length = query.snapshotLength; i < length; ++i) {
    results.push(query.snapshotItem(i));
  }
  return results;
}
chrome:
$x(xpath) in the DevTools console
chromedriver:
driver.switch_to.frame(0)  # search XPath inside the first iframe
driver.switch_to.parent_frame()  # switch back to the parent frame
XPath syntax
- `/following-sibling::`: next sibling
- `/..`: parent
- `/*`: child
- `//*`: all descendants
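The axes listed above can be tried without a browser. This is a small sketch using `xml.etree.ElementTree`, which implements only a subset of XPath: relative paths start with `.`, and `following-sibling::` is not supported, so the last step emulates it through the parent. The sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><a><b/><c/></a><d/></root>")

# /* : direct children of the context node
children = [el.tag for el in root.findall("./*")]

# //* : all descendants, in document order
descendants = [el.tag for el in root.findall(".//*")]

# /.. : parent of every <b> element
parents_of_b = [el.tag for el in root.findall(".//b/..")]

# following-sibling:: is not available in ElementTree, so emulate it:
# look up the element's position among its parent's children.
a = root.find("a")
siblings = list(a)
next_after_b = siblings[siblings.index(a.find("b")) + 1].tag

print(children, descendants, parents_of_b, next_after_b)
```

In Selenium or the browser console the full XPath 1.0 syntax is available, so `//b/following-sibling::*` can be written directly there.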
# requests
- `auth`: user and password, either in the URL (http://user:password@hostname) or as auth=('user', 'password')
- `data`: default content type is application/x-www-form-urlencoded; a custom content type can be set in headers, but multipart/form-data is not supported this way
- `files`: content type is multipart/form-data; each value is either the file data alone or a tuple: {name: (filename[can be None], fileobj[open() or text], content_type[optional], custom_headers[optional])}
- `json`: content type is application/json
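To make the difference between these arguments concrete, here is a standard-library sketch of roughly what ends up on the wire. It does not call requests itself; it only approximates the encodings requests applies, and the payload is invented for the example.

```python
import base64
import json
from urllib.parse import urlencode

payload = {"name": "foo bar", "id": 123}

# data=payload -> body encoded as application/x-www-form-urlencoded
form_body = urlencode(payload)

# json=payload -> body serialised as application/json
json_body = json.dumps(payload)

# auth=('user', 'password') -> HTTP Basic credentials in the Authorization header
token = base64.b64encode(b"user:password").decode("ascii")
auth_header = f"Basic {token}"

print(form_body)
print(json_body)
print(auth_header)
```

Note that the form encoding turns the space into `+`, which ties back to the URL encoding notes: the same value would be written with `%20` in a path.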
WARNING
To post a file as multipart/form-data with Flask's test_client:
client.post(url, data={'document': open('file_path', 'rb')}, headers={'content-type': 'multipart/form-data'})
# Api server
WSGI stands for Web Server Gateway Interface, and ASGI stands for Asynchronous Server Gateway Interface. Both specify the interface between a web server and a Python web application or framework.
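The two interfaces are small enough to show side by side. Below is a minimal sketch of one application written against each; `call_wsgi` and `call_asgi` are hypothetical helpers that play the role of the web server, only to show what the server passes in and gets back.

```python
import asyncio
from wsgiref.util import setup_testing_defaults


def wsgi_app(environ, start_response):
    """WSGI: a synchronous callable taking the request environ and a start_response callback."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hi there"]


async def asgi_app(scope, receive, send):
    """ASGI: an async callable that exchanges event dicts through receive/send."""
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hi there"})


def call_wsgi(app):
    """Stand-in for a WSGI server: build an environ and collect the response."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def call_asgi(app):
    """Stand-in for an ASGI server: record every event the app sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(event):
        sent.append(event)

    asyncio.run(app({"type": "http", "method": "GET", "path": "/"}, receive, send))
    return sent
```

Servers such as gunicorn (WSGI) play the part of these helpers in production; frameworks such as Flask produce the WSGI callable, while async frameworks expose an ASGI one.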
# Flask
from flask import Flask, Blueprint, request, jsonify
app = Flask(__name__)
# a blueprint groups routes under a common prefix
api = Blueprint('serverless_handler', __name__)
@api.route('/', methods=['GET'])  # with a url_prefix, the rule need not start with a slash
def home():
    return 'Hi there'  # plain text is acceptable
app.register_blueprint(api, url_prefix='/api/webhook')
to start: FLASK_APP=app.py flask run
# Flask + Gunicorn
Run a gunicorn app in PyCharm:
- script => /path/to/env/bin/gunicorn
- script parameters => -c python:gunicorn_conf package_path.app:app
- initialize logconfig_dict, workers, and timeout in gunicorn_conf.py
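The three settings named above might be initialised along these lines. This is only a sketch: the numeric values and the log format are assumptions, not taken from the original notes. Gunicorn replaces the top-level keys of its default logging config with those in `logconfig_dict`, so the sketch defines handlers, formatters, and loggers together to keep them consistent.

```python
# gunicorn_conf.py (sketch; all values are illustrative assumptions)
import multiprocessing

# Number of worker processes; (2 x cores) + 1 is a commonly quoted starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 120

# Same structure as logging.config.dictConfig expects.
logconfig_dict = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "loggers": {
        "gunicorn.error": {"level": "INFO", "handlers": ["console"], "propagate": False},
        "gunicorn.access": {"level": "INFO", "handlers": ["console"], "propagate": False},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}
```

With `-c python:gunicorn_conf`, gunicorn imports this module by name, so it has to be importable from the working directory or the Python path.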
# connexion.FlaskApp
Wrapper around flask.Flask; the underlying Flask app is exposed as the .app attribute.
# Start an app
inside package_path/app.py
if __name__ == '__main__':
    app.run(port=8000)
# request
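# app is the connexion.FlaskApp instance; app.app is the underlying flask.Flask
test_client = app.app.test_client()
# query_string is appended to the URL
# json is sent as the request body
# a data value may be a (file object, filename) tuple when uploading files
# pass either json or data, not both
test_client.get('route_only', query_string={}, json={})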
# OpenAPI
API description format for REST APIs. By leveraging JSON Schema, OpenAPI can also validate request and response formats.
# Instability in OpenAPI
OpenAPI 3.0 uses an extended subset of JSON Schema Specification Wright Draft 00 (aka Draft 5) to describe the data formats.
OpenAPI tools: editor
# Instability in JSON Schema
Info on JSON Schema development
Caveats:
- list-of-dict check: both an empty list and a list of dicts with the required properties are valid
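The caveat above follows from how `items` works: it constrains every element, and "every element" is trivially true for an empty list. The toy check below mirrors that logic to make the point; it is not a real validator and not how any JSON Schema library is implemented. Adding the standard `minItems` keyword is one way to reject the empty list.

```python
schema = {
    "type": "array",
    "items": {"type": "object", "required": ["name"]},
}

# Same schema, but an empty list is no longer acceptable.
strict_schema = {**schema, "minItems": 1}


def items_ok(instance, schema):
    """Toy check mirroring the semantics of items + required + minItems."""
    if not isinstance(instance, list):
        return False
    if len(instance) < schema.get("minItems", 0):
        return False
    required = schema["items"].get("required", [])
    # all() over an empty list is True, which is why [] passes.
    return all(
        isinstance(item, dict) and all(key in item for key in required)
        for item in instance
    )


print(items_ok([], schema))
print(items_ok([{"name": "x"}], schema))
print(items_ok([{}], schema))
print(items_ok([], strict_schema))
```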
# Parameter Handling
In the OpenAPI 3.x.x spec, the requestBody does not have a name. By default it will be passed in as `body`. You can optionally provide the `x-body-name` parameter in your operation (or legacy position within the requestBody schema) to override the name of the parameter that will be passed to your handler function.
/path:
  post:
    requestBody:
      x-body-name: body
      content:
        application/json:
          schema:
# Some examples
# File as in request body
openapi:
requestBody:
  content:
    multipart/form-data:
      schema:
        type: object  # required for swagger to add widget
        properties:
          document:
            type: string
            format: binary
swagger:
# Json as in multipart/form-data:
openapi:
/test:
  post:
    summary: test
    operationId: api.test
    tags:
      - document
    requestBody:
      content:
        multipart/form-data:
          encoding:
            body:
              contentType: application/json
          schema:
            type: object
            required: [body]
            properties:
              body:
                type: object
                required:
                  - fields
                properties:
                  fields:
                    type: array
accepts:
--610267985175094a52c9b65216d3e15c
Content-Disposition: form-data; name="body"

{"fields": []}
--610267985175094a52c9b65216d3e15c--
access:
def test(body):  # see Parameter Handling
    body['body']
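To see why the handler reads `body['body']`, the multipart payload shown above can be taken apart by hand. The sketch below uses the standard-library `email` parser as a stand-in for the framework's own multipart handling (an assumption made for illustration; connexion does not necessarily parse it this way) and decodes the part named `body` as JSON.

```python
import json
from email import message_from_bytes

boundary = "610267985175094a52c9b65216d3e15c"

# The request body from the "accepts" sample, with CRLF line endings.
raw_body = (
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="body"\r\n'
    "\r\n"
    '{"fields": []}\r\n'
    f"--{boundary}--\r\n"
).encode("ascii")

# The email parser needs the Content-Type header to learn the boundary.
header = f"Content-Type: multipart/form-data; boundary={boundary}\r\n\r\n".encode("ascii")
message = message_from_bytes(header + raw_body)

# Collect each form field by name and decode its content as JSON.
form = {}
for part in message.get_payload():
    name = part.get_param("name", header="content-disposition")
    form[name] = json.loads(part.get_payload())

print(form)
```

The outer dictionary corresponds to the multipart form and the inner one to the JSON document, which is why the field name appears twice in the handler.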
# sanic
from sanic import Sanic
from sanic.response import text
app = Sanic('MyHelloWorldApp')
@app.get('/')
async def hello_world(request):
    return text('Hello, world.')
to start: sanic path.file:app
# aiohttp
from aiohttp import web
routes = web.RouteTableDef()
@routes.get('/')  # path must start with /
async def hello(request):
    return web.Response(text="Hello, world")  # plain text must be wrapped in web.Response
app = web.Application()
app.add_routes(routes)
if __name__ == '__main__':
    web.run_app(app)
Regarding cancellation: https://github.com/aio-libs/aiohttp/pull/6727/commits
# request
import os
import pathlib
from connexion import AioHttpApp  # connexion 2.x
def create_app():
    api_path = pathlib.Path(__file__).parent / 'openapi'
    app = AioHttpApp(
        __name__,
        arguments={
            "API_VERSION": os.getenv('API_VERSION', 'devel'),
            "GIT_COMMIT_SHA": os.getenv('GIT_COMMIT_SHA', 'devel'),
        },
        specification_dir=api_path.as_posix(),
        server_args={'client_max_size': int(CLIENT_MAX_SIZE)},  # CLIENT_MAX_SIZE is defined elsewhere
    )
    app.add_api('api.yaml', validate_responses=False, base_path='/ai')
    return app
# pip install pytest-aiohttp==0.3.0
import pytest
@pytest.fixture(scope='function')
def test_client(aiohttp_client, loop):
    app = create_app().app
    return loop.run_until_complete(aiohttp_client(app))
# https://docs.aiohttp.org/en/stable/client_reference.html
async def test_x(test_client):
    # data: the data to send in the body of the request. This can be a FormData object or anything that can be passed into FormData, e.g. a dictionary, bytes, or file-like object. (optional)
    # json: any JSON-compatible Python object (optional). json and data cannot be used at the same time.
    # params: mapping, iterable of key/value tuples, or string to be sent as parameters in the query string of the new request. Ignored for subsequent redirected requests.
    response = await test_client.get('route_only', params={}, json={})  # or data=... instead of json=...
# Remarks
data={'key': 'json_string', 'key2': b'bytes'}
=> if any value is bytes, the request is sent as multipart/form-data automatically and json_string is converted to JSON
# Celery
Distributed system
To start a worker:
celery -A package.module_with_celery_instance worker --concurrency=1 --loglevel=DEBUG
Remark:
celery.autodiscover_tasks(package_list, name, force=False) discovers tasks in lazy mode
- lazy mode works when the file where the tasks are defined contains few imports and its dependencies are easy to resolve
- otherwise, the package must be explicitly imported in the start file
- otherwise, change force to True