Testing Network Errors With MongoDB

Someone asked on Twitter today for a way to trigger a connection failure between MongoDB and the client. This would be terribly useful when you’re testing your application’s handling of network hiccups.

You have options: you could use mongobridge to proxy between the client and the server, and at just the right moment, kill mongobridge.

Or you could use packet-filtering tools to accomplish the same: iptables on Linux and ipfw or pfctl on Mac and BSD. You could use one of these tools to block MongoDB’s port at the proper moment, and unblock it afterward.

There’s yet another option, not widely known, that you might find simpler: use a MongoDB “failpoint” to break your connection.

Failpoints are our internal mechanism for triggering faults in MongoDB so we can test their consequences. Read about them on Kristina’s blog. They’re not meant for public consumption, so you didn’t hear about them from me.

The first step is to start MongoDB with the special command-line argument:

mongod --setParameter enableTestCommands=1

Next, log in with the mongo shell and tell the server to abort the next two network operations:

> db.adminCommand({
...   configureFailPoint: 'throwSockExcep',
...   mode: {times: 2}
... })
2014-03-20T20:31:42.162-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The server obeys you instantly, before it even replies, so the command itself appears to fail. But fear not: you’ve simply seen the first of the two network errors you asked for. You can trigger the next error with any operation:

> db.collection.count()
2014-03-20T20:31:48.485-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The third operation succeeds:

> db.collection.count()
2014-03-20T21:07:38.742-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-03-20T21:07:38.742-0400 reconnect 127.0.0.1:27017 (127.0.0.1) ok
1

There’s a final “failed” message that I don’t understand, but the shell reconnects and the command returns the answer, “1”.
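
If you script these steps instead of typing them, it may help to build the command document in one place. Here is a minimal sketch in plain JavaScript. The helper name failPointCommand is hypothetical, and using the 'off' mode to disable the failpoint early is an assumption that is not shown in the transcript above, so check it against your server version before relying on it.

```javascript
// Hypothetical helper: builds the document you would pass to db.adminCommand().
// `times` is how many network operations to break. A value of 0 or less asks
// for mode 'off' (an assumption, not something demonstrated in this post).
function failPointCommand(name, times) {
  var mode = times > 0 ? { times: times } : 'off';
  // configureFailPoint is listed first because it names the command.
  return { configureFailPoint: name, mode: mode };
}

// Same document as the one typed into the shell above.
var breakTwice = failPointCommand('throwSockExcep', 2);

// Candidate document for switching the failpoint off again.
var switchOff = failPointCommand('throwSockExcep', 0);
```

In the mongo shell you would then run something like db.adminCommand(breakTwice). As in the walkthrough, expect that call itself to report a network error.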

You could use this failpoint when testing a driver or an application. If you don’t know exactly how many operations you need to break, you could set times to 50 and, at the end of your test, continue attempting to reconnect until you succeed.
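
The "keep reconnecting until you succeed" step might look roughly like the sketch below. It assumes a failed operation surfaces as a thrown exception, which is how most drivers report network errors. The names retryUntilConnected and makeFlakyOperation are hypothetical, and the flaky operation is only a stand-in so the example runs without a MongoDB server.

```javascript
// Hypothetical helper: calls `operation` until it stops throwing, giving up
// after `maxAttempts` tries. Returns the result plus the number of attempts.
function retryUntilConnected(operation, maxAttempts) {
  var lastError = null;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { result: operation(), attempts: attempt };
    } catch (err) {
      lastError = err; // treat it as a network error and try again
    }
  }
  throw new Error('still failing after ' + maxAttempts + ' attempts: ' + lastError);
}

// Stand-in for a server with `failures` broken operations still pending.
function makeFlakyOperation(failures, value) {
  var remaining = failures;
  return function () {
    if (remaining > 0) {
      remaining--;
      throw new Error('simulated network error');
    }
    return value;
  };
}

// Two failures are still pending, so the third attempt succeeds.
var outcome = retryUntilConnected(makeFlakyOperation(2, 1), 51);
// outcome.result is 1 and outcome.attempts is 3
```

With times set to 50, a limit of 51 attempts is enough to use up every remaining failure. Against a real server you would pass a function that runs a cheap operation, such as the count() used earlier.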

Ugly, perhaps, but if you want a simple way to cause a network error this could be a reasonable approach.

Reference: Testing Network Errors With MongoDB from our WCG partner Jesse Davis at the A. Jesse Jiryu Davis blog.

Jesse Davis

Jesse is a senior engineer at MongoDB in New York City. He specializes in Python, MongoDB drivers, and asynchronous frameworks.