I was interested in using ETL services like Stitch Data or Skyvia to integrate different data sources (including my Discourse database), but I have been told by someone at Skyvia that this is not possible:
Skyvia can connect to PostgreSQL through SSH, however it is not possible to connect to it when it’s inside a docker container while SSH server is not in the container but in front of it.
You could enable ssh in the discourse container (on a non standard port) and then allow them to connect there. I think there may be an example in the samples directory of Discourse_docker.
I seem to be missing a key concept because I’m able to connect via ssh with the custom port and then do su postgres -c 'psql discourse' without a problem. Everything works in this two-step approach but I think what I need in order to connect directly via pgAdmin (for example) is something slightly different.
This is the command I’m using to expose a custom port:
Which allows me to later do this directly (without running the docker container via launcher enter app):
ssh whatever@host -p 2222
su postgres -c 'psql discourse'
I’ve tried several things, but unsuccessfully. I feel like there should be a way to do ssh whatever@host -p XXXX and connect directly to the database (which is probably what pgAdmin is expecting)
You need to expose the PostgreSQL port directly for you to be able to connect via pgAdmin.
In the app.yml, near the top, you see the 80 and 443 ports open. You can add another line for the port 5432 for PostgreSQL.
That said, this is mostly certainly a very bad idea. The database went from accepting only local connections to being exposed to the whole internet.
If all you need is some occasional reporting, downloading some CSVs from Data Explorer and loading those in your favorite tool may be enough. You can also download Discourse backups (without uploads) and those are just the standard PostgreSQL dump format. With that in hand you can restore it to a local PostgreSQL instance for analysis.
If you add 5432 to the app.yml it is exposed directly, without needing the SSH tunnel.
I can’t give advice on the pgAdmin SSH tunnel, as I have never used it. I assume it expects the port to list to local connections, so it doesn’t need to be exposed to the internet.
But there isn’t a postgres password because it requires being a superuser: the pg_hba.conf file has “local” connection permissions set to “peer”, so it depends on the UNIX user, which requires logging via SSH, no?
Right, I have no problem connecting from the app docker container. My problem is connecting directly to the postgres DB from my local machine (so I can use pgAdmin) or from a cloud data processor like Stitch. Both of these expect a host IP address and SSH credentials, but I haven’t been able to get them to work (I get the error I showed above).
The only thing I’ve been able to do is use docker-ssh to access the app docker container directly (via publicKey) from my local computer (without doing launcher enter app), but I still need to do su postgres 'psql discourse' in order to access the DB, which I assume is the problem with pgAdmin/Stitch—they expect a direct connection.