
Hi, I’m Tom from AWS, and today I’m joined by James from Zappos.

– Welcome, James.

– Hi, Tom.

– So, tell us: what does Zappos do?

– Zappos is a service company that happens to sell blank, and I think we’re best known for our customer service as well as the shoe selection that we sell online.

– Everybody loves Zappos.

– Absolutely.

– Yeah, so what’s your role at Zappos?

– I’m a Lead Software Engineer.

And I’m over the back office systems team.


– So we’re gonna talk about some of your back office stuff here today. What are we looking at?

– So, what we’re looking at is a simplified version of one of the architectures that we help support.

We pretty much have a front end and a back end.

– And if you don’t mind, I’ll kind of...

– Yeah, just jump into it.


So, one of the original problems that we were trying to solve was the fact that we had an inconsistency between our production deploys and our development deploys.

That was creating support issues and everything else, so when I joined the team, I was like: “We need to fix this.”

So to help make those consistent, we introduced Docker, and Docker was able to do consistent deploys between the environments. That reduced support, gave a better customer experience to our internal customers, and we went from there.



– And I guess that’s why you’re using ECS.

– So then when we did the migration to AWS, ECS was a natural fit for that.


– So walk us through the data flow here. Where do we start?

– Sure. We have a human-readable DNS name, which is provided to us by Route 53. From there, that goes to an Application Load Balancer, and that then goes to our Elastic Container Service, where we have two dedicated instances running, but it’s also set up to scale, so if we need to get up to ‘n’, then we can get the...



– Sure, to get the scalability.

– Exactly.

And then from there, the front end calls the back end, which goes back to Route 53, which then goes to our back end’s Application Load Balancer, which then goes to another ECS service, which again has two dedicated instances and can scale as needed.


– So same pattern on the front end and the back end?

– Correct. And then the back end has a connection to Aurora MySQL, and as part of its processing, it also generates messages for and consumes messages from an Amazon MQ setup.

– Amazon MQ? So tell us about the choice of using Amazon MQ.


– So this was a slightly older code base, and we had an older version of ActiveMQ that we were using, and we had several instances where we had production issues.

So when we were doing the migration to AWS, we were like, “Hey, can we actually do a replacement?” And part of the reason why we chose Amazon MQ is that the protocol is pretty much one-to-one with ActiveMQ.

– So it just allowed you to take those applications that you already had and plug them right into Amazon MQ.

– Exactly.

– So you’ve got a cool scalable architecture here, and you’re pushing that data into Aurora MySQL, but did you start with Aurora when you started that migration?

– No, we actually started off on an Oracle database, and then part of a directive from above was that we needed to get off of Oracle, so we looked at Aurora PostgreSQL originally, because the syntax between...

– That would make a lot of sense.

– Mm-hmm.

Because the syntax is very similar, but our team was also supporting a bunch of other databases that were similar to MySQL.



– OK.

– and the team had more experience so they could do better performance engineering utilizing MySQL.

Got it.

– So what did that take? How did you go from Oracle, this proprietary database, over to MySQL?

– Sure. So what we did was utilize DMS to do a data dump into a database, so that we could do some light performance testing to make sure of two things. First, that when we did the query translation for the variances in the syntax, and for functions that may not exist, we were getting the same results. Then, after that, we looked at the timing of the queries and made sure they were within the standard deviation.
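As a rough illustration of the two checks James describes, the sketch below compares result sets and then compares timings. It is not Zappos's tooling: the function names, the `num_std` threshold, and the reading of "within the standard deviation" as "no slower than the baseline mean plus one standard deviation" are all assumptions made for this example.

```python
import statistics


def rows_match(source_rows, target_rows):
    """Return True when both result sets contain the same rows.

    Row order is ignored, since the two engines may return rows in a
    different order when the query has no ORDER BY. Rows are assumed to be
    tuples of mutually comparable values (no NULLs mixed with numbers).
    """
    return sorted(source_rows) == sorted(target_rows)


def timing_within_baseline(baseline_secs, candidate_secs, num_std=1.0):
    """Return True when the translated query is not meaningfully slower.

    baseline_secs: repeated timings of the original query (hypothetical data).
    candidate_secs: one timing of the translated query.
    The cutoff (mean + num_std * stdev) is an assumed interpretation.
    """
    mean = statistics.mean(baseline_secs)
    stdev = statistics.stdev(baseline_secs)
    return candidate_secs <= mean + num_std * stdev
```

A query that fails `rows_match` would point to a translation problem, while one that only fails `timing_within_baseline` would be a candidate for the performance engineering discussed next.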

And if they weren’t, then we did performance engineering where we added additional indexes and compound indexes, and we even did some setups where we would run a query and write the result to a temporary table, as the equivalent of an MView (a materialized view), so that we could then query against that. I remember one instance in particular where we improved a query from 15 seconds down to 15 milliseconds.

– Say that again.

– We had a query where we reduced the time from 15 seconds to 15 milliseconds, because once we wrote the result out, it was just that much easier to query.
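The "query once, write to a table, then query the table" idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for Aurora MySQL, and the `orders` and `order_totals` tables, their columns, and the helper names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 7.5)],
)


def refresh_summary(conn):
    """Rebuild the precomputed table from the expensive aggregate query."""
    conn.execute("DROP TABLE IF EXISTS order_totals")
    conn.execute(
        "CREATE TEMP TABLE order_totals AS "
        "SELECT customer, SUM(total) AS total FROM orders GROUP BY customer"
    )


def total_for(conn, customer):
    """Read from the precomputed table instead of re-running the aggregate."""
    row = conn.execute(
        "SELECT total FROM order_totals WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0] if row else None


refresh_summary(conn)
```

The trade-off is staleness: reads are cheap, but the table only reflects the data as of the last `refresh_summary` call, so it has to be rebuilt on whatever schedule the application can tolerate.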

That’s incredible.

– So the use of the Database Migration Service, it sounds like it really helped you.

– Yes, it was pretty much the foundation for the migration, and after that, it came down to a lot of performance engineering.

– That’s great.

– And how is it working for you now?

– Fantastic.

It’s running fast, reliable, scalable.

We have it set up so that we have read and write endpoints that we’re utilizing, and the readers can actually scale.

So if we hit a big performance limit for a relational database, we still have a way to actually scale with that.


– So you’ve got one instance that you’re writing to, and then a whole bunch of read instances for throughput.

– Correct.


– So one writer, one reader.

However, we can...



– You can scale up the readers.


– So we can scale horizontally instead of having to do it vertically for the relational database.
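Aurora exposes a cluster (writer) endpoint and a separate reader endpoint that spreads connections across the read replicas. A minimal sketch of how an application might pick between them is shown below; the `EndpointRouter` class, the hostnames, and the "SELECT goes to the reader" rule are assumptions for illustration, not a description of the Zappos code.

```python
class EndpointRouter:
    """Choose a database endpoint based on the kind of SQL statement."""

    def __init__(self, writer_endpoint, reader_endpoint):
        # Both hostnames are supplied by the caller; the values used in the
        # usage example are placeholders, not real Aurora endpoints.
        self.writer_endpoint = writer_endpoint
        self.reader_endpoint = reader_endpoint

    def endpoint_for(self, sql):
        """Send plain SELECT statements to the reader, everything else to the writer."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return self.reader_endpoint
        return self.writer_endpoint


router = EndpointRouter("writer.example.internal", "reader.example.internal")
read_host = router.endpoint_for("SELECT * FROM orders")
write_host = router.endpoint_for("INSERT INTO orders VALUES (1)")
```

Because reads go through a single reader endpoint, adding replicas behind it increases read throughput without any change to the application, which is the horizontal scaling described above. Reads that must see a just-committed write may still need to go to the writer, since replicas can lag slightly.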


Well, this is really cool architecture.

You solved some really big problems in creative ways.

Thank you for sharing your architecture with us today, and thank you for watching.

This is My Architecture.


