Migrating from one Chef server to another

It happens: you’re on a server that just can’t be upgraded any further, and you need more resources.  Or, you need to back up a Chef server.  Or, you need to set up a QA instance.  Or, you need to finally migrate from Chef 10 to Chef 11.  Or, you have one of many other possible reasons, but you need to be able to stand up a new Chef instance, and not have to do a ton of work.  If any of that applies to you, then this post is for you.

In the case where you’re migrating from one Chef server to another (i.e., the old one is going bye-bye), it would be very helpful to have your Chef server be CNAMEd (e.g. chef.company.com -> vm101.iad.company.com) or behind a load balancer/proxy where you can change targets easily.  That way, you won’t need to update the client configs, and it’ll be an easy swap.  Everything should “just work” ™.

First, we’ll make a copy of your knife.rb:
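A minimal sketch of that copy, assuming knife.rb lives in the default ~/.chef directory.  The name knife-old.rb is just a placeholder, so pick whatever you like:

```shell
# Assumed default location of the knife config; set CHEF_DIR if yours differs.
CHEF_DIR="${CHEF_DIR:-$HOME/.chef}"

# Keep the old server's settings under a new (hypothetical) name.
# -p preserves the file's mode and timestamps.
if [ -f "$CHEF_DIR/knife.rb" ]; then
  cp -p "$CHEF_DIR/knife.rb" "$CHEF_DIR/knife-old.rb"
fi
```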

Now, we’ll need to get access to your new Chef server via knife.  You can do so by logging in as admin, and regenerating and saving a new private key.  You can also create a new user here instead of using admin, but I advise against this, as any user you create will conflict with users of the same name from the old server.  Yes, that means that if you’ve been using ‘admin’ as the main user, you may run into problems (but let’s just hope that you’ve been using per-person accounts).

Now, we’ll update your current knife.rb to reflect the new server’s information:
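As a rough sketch, a bare-bones knife.rb for the new server might be written like this.  The host name vm102.iad.company.com and the key file name admin.pem are made-up placeholders; use your new server's URL and wherever you saved the regenerated key.  Note that this overwrites knife.rb, so run it only once the old settings are safely copied elsewhere:

```shell
# Assumed default location of the knife config; set CHEF_DIR if yours differs.
CHEF_DIR="${CHEF_DIR:-$HOME/.chef}"
mkdir -p "$CHEF_DIR"

# Write a minimal config for the NEW server (placeholder values).
cat > "$CHEF_DIR/knife.rb" <<EOF
node_name 'admin'
client_key '$CHEF_DIR/admin.pem'
chef_server_url 'https://vm102.iad.company.com'
EOF
```

If your old knife.rb carried other settings (cookbook paths and so on), copy those lines over as well.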

It wouldn’t hurt to check that you have access to the new server by running knife user list.

Now, we’ll need to download all of the data from the “old” Chef server.  To do so, we’ll be using the nifty ‘knife backup’ plugin.  To get it installed on OS X, I did:

Now, to finally back things up, we’ll do:

Note that the argument after -D is the destination directory where all of the Chef data will go; this directory will automatically be created for you.  The argument of -c tells knife which config file to use; we’ll, of course, be using the “old” server here.  Also, if you only need to back up a certain set of data from your Chef server (e.g. only users and environments), you can specify that.  See the knife backup documentation for details.

Now that we have all the data we need, we’ll need to push it up to the new server.  This works much the same as the export:

I left off the -c here because knife.rb is the default config file.
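Putting the export and restore together, the whole round trip can be sketched as a small shell function.  The function name and its two arguments are my own invention for illustration; the knife backup export/restore subcommands and the -D and -c options are the ones described above.

```shell
# Hypothetical helper: export from the old server, then restore to the new one.
#   $1 = knife config that still points at the OLD server
#   $2 = directory to hold the exported data
migrate_chef_data() {
  old_config="$1"
  backup_dir="$2"

  # Pull everything down from the old server; stop if the export fails.
  knife backup export -D "$backup_dir" -c "$old_config" || return 1

  # Push it to the new server; no -c, so the default knife.rb is used.
  knife backup restore -D "$backup_dir"
}
```

For example, migrate_chef_data ~/.chef/knife-old.rb ~/chef-backup would export using the old config into ~/chef-backup and then restore from there with the default knife.rb.  The restore step asks for confirmation before overwriting anything.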

Once everything has been restored, your original user in Chef will now be available (you can verify this via the Chef Server UI).  The amazing thing is that your keys have not changed, and can be used as-is.  Chef Server keeps track of your public keys, so all of your private keys for all nodes/clients are still good.

This, now, is where you update your knife.rb to reflect your original user settings.  If you’re running behind a load balancer/proxy, you can simply use your original config as-is after replacing the old server with the new one.  If you’re doing the CNAME/A record route, you can do the same once DNS has propagated.  Otherwise, you can overwrite your new config with your old one, and edit it to reflect the new server’s URL.

If your nodes are pointing to the wrong server in their client.rb, you can use knife ssh with sed to find/replace the server URLs.
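Here's a sketch of what that could look like.  The two URLs are placeholders, /etc/chef/client.rb is assumed as the client config path on the nodes, sudo is assumed to be available there, and the 'name:*' query matches every node, so narrow it down if you only want a subset.  The command is printed first so you can look it over; the knife ssh line itself is left commented out.

```shell
# Placeholder URLs: the old server and its replacement.
OLD_URL="https://vm101.iad.company.com"
NEW_URL="https://vm102.iad.company.com"

# Using '|' as the sed delimiter avoids escaping the slashes in the URLs.
SED_EXPR="s|$OLD_URL|$NEW_URL|g"

# The command each node will run; -i.bak edits in place and keeps a backup copy.
REMOTE_CMD="sudo sed -i.bak '$SED_EXPR' /etc/chef/client.rb"

# Review the command before sending it anywhere.
echo "$REMOTE_CMD"

# When it looks right, run it on the nodes:
# knife ssh 'name:*' "$REMOTE_CMD"
```

Since the nodes are still registered with the old server at this point, the knife ssh search would need to run against a config that can still look them up.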

If you’ll be accessing multiple Chef servers frequently enough, I highly recommend looking at the knife block plugin.  That way, you can switch between different configurations with ease, including those for Berkshelf.

7 comments on “Migrating from one Chef server to another”
  1. Phil Nguyen says:

    Hi Ameir,
    The backup operation completed successfully (i.e. list of folders with json files etc..). However, the restore operation failed to process the backup folder as shown below. Do you know what am I missing? I will retry this using a Linux box to see if that will help. Thanks for the script. It will save a lot of pain if this works.

    D:\P4\depot\vault\main\hpool\chef-repo>knife backup restore -D d:\chef-backup -c C:\Users\pnguyen\.chef\knife.rb
    WARNING: This will overwrite existing data!
    Do you want to restore backup, possibly overwriting exisitng data? (Y/N)Y
    === Restoring clients ===
    === Restoring users ===
    === Restoring nodes ===
    === Restoring roles ===
    === Restoring data bags ===
    === Restoring environments ===
    === Restoring cookbooks ===

  2. Hi Phil,

    Could you go into d:\chef-backup and run knife diff? That’ll compare the local folder with the remote server, and let you know if there are differences. It’s possible that the files are the same (are you using the correct config file?). You could also try with a trailing slash; I don’t have a Windows box to test with, but there may be nuances there. Also, you could use knife upload instead of knife backup. The former is essentially what the latter does behind the scenes. To try that, go into d:\chef-backup and run knife upload . (the trailing dot means the current directory). Hopefully that’ll work. Let me know if it doesn’t and I’ll try to help out.


  3. Phil Nguyen says:

    Update: FYI, it worked when executing the backup/restore script via Ubuntu workstation. Thank you.

  4. Excellent, glad to hear it! There must be an issue on the Windows side of things. Good luck with your new Chef server!

  5. gdanko says:

    I am seeing this:
    === Restoring cookbooks ===
    Restoring cookbook [“publiccloud_lms_install_jdk”]
    Uploading publiccloud_lms_install_jdk [0.1.0]
    ERROR: Server returned error 500 for https://localhost/sandboxes/00000000000012b561684b15f8b1df3f, retrying 1/5 in 4s
    ERROR: Server returned error 500 for https://localhost/sandboxes/00000000000012b561684b15f8b1df3f, retrying 2/5 in 7s
    ERROR: Server returned error 500 for https://localhost/sandboxes/00000000000012b561684b15f8b1df3f, retrying 3/5 in 13s
    ERROR: Server returned error 500 for https://localhost/sandboxes/00000000000012b561684b15f8b1df3f, retrying 4/5 in 29s
    ERROR: Server returned error 500 for https://localhost/sandboxes/00000000000012b561684b15f8b1df3f, retrying 5/5 in 54s
    ERROR: internal server error
    Response: internal service error

    Any idea what could be wrong?

  6. A 500 error means that something server-side is having issues. Are you able to upload anything to your Chef server? Could you also add --verbose to your command to see if it gives any additional details?

  7. Frederick N. Brier says:

    I am working on using this blog post to migrate a Chef Server 11 to a Chef Server 12. When I got to the validation step, “knife user list”, it failed with “ERROR: You authenticated successfully to as admin but you are not authorized for this action”, “Response: missing read permission”. The answer was derived from a post on a Japanese blog by Masaya Aoyama. With Chef 12 we need to specify the organization in the knife.rb “chef_server_url”, property by appending “/organizations/” to the URL, so the value becomes ‘https:///organizations/’. I am not done doing the restore yet, but thank you for this post!
