A Fix for OperationalError: (psycopg2.OperationalError) SSL error: decryption failed or bad record mac

I kept seeing this error message with a uwsgi/Flask/SQLAlchemy application:

[2019-09-01 23:43:34,378] ERROR in app: Exception on / [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./http.py", line 11, in foo
    session.execute('SELECT 1')
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 162, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1269, in execute
    clause, params or {}
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
OperationalError: (psycopg2.OperationalError) SSL error: decryption failed or bad record mac

This issue was difficult to figure out. In fact, I was annoyed enough I made a simple reproduction repo. Searches of StackOverflow did not offer great solutions. The closest I found was the following post:

uWSGI, Flask, sqlalchemy, and postgres: SSL error: decryption failed or bad record mac

The accepted solution suggests using lazy-apps = true in uwsgi.ini. This isn’t a great solution because of the increased memory use. When I enabled this on my application, it slowed down noticeably, but the problem was gone!

The post gave me a clue about the nature of the problem. When uwsgi starts, the python application is imported, then N forks (called workers in uwsgi) of the application are copied (N = processes in uwsgi.ini). This post from Ticketea engineering illustrated this nicely.

Most of the time this is OK! My application’s particular use of psycopg2 was problematic. On the initial application startup, a database connection/engine was created with SQLalchemy. That same connection was copied to the uwsgi worker processes. Psycopg documentation calls out uses like this in its documentation:

The Psycopg module and the connection objects are thread-safe: many threads can access the same database either using separate sessions and creating a connection per thread or using the same connection and creating separate cursors. In DB API 2.0 parlance, Psycopg is level 2 thread safe.

The documentation continues, with the important point:

The above observations are only valid for regular threads: they don’t apply to forked processes nor to green threads. libpq connections shouldn’t be used by a forked processes, so when using a module such as multiprocessing or a forking web deploy method such as FastCGI make sure to create the connections after the fork.

A ha! My connection that was created on application import/startup needed to be torn down! This is an easy enough fix: calling engine.dispose after I’ve used the import/startup time connection.