
Cloud

Get to know the newest AWS Heroes – Winter 2019

AWS Blog - 9 hours 11 min ago

AWS Heroes are superusers who possess advanced technical skills and are early adopters of emerging technologies. Heroes are passionate about sharing their extensive AWS knowledge with others. Some get involved in person by running meetups and workshops and by speaking at conferences, while others share with online AWS communities via social media, blog posts, and open source contributions.

2019 is off to a roaring start and we’re thrilled to introduce you to the latest AWS Heroes:

Aileen Gemma Smith
Ant Stanley
Gaurav Kamboj
Jeremy Daly
Kurt Lee
Matt Weagle
Shingo Yoshida

Aileen Gemma Smith – Sydney, Australia

Community Hero Aileen Gemma Smith is the founder and CEO of Vizalytics Technology. The team at Vizalytics serves public and private sector clients worldwide in transportation, tourism, and economic development. She shared their story in the Building Complex Workloads in the Cloud session, at AWS Canberra Summit 2017. Aileen has a keen interest in diversity and inclusion initiatives and is constantly working to elevate the work and voices of underestimated engineers and founders. At AWS Public Sector Summit Canberra in 2018, she was a panelist for We Power Tech, Inclusive Conversations with Women in Technology. She has supported and encouraged the creation of internships and mentoring programs for high school and university students with a focus on building out STEAM initiatives.

Ant Stanley – London, United Kingdom

Serverless Hero Ant Stanley is a consultant and community organizer. He founded and currently runs the Serverless London user group, and he is part of the ServerlessDays London organizing team and the global ServerlessDays leadership team. Previously, Ant was a co-founder of A Cloud Guru, and responsible for organizing the first Serverlessconf event in New York in May 2016. Living in London since 2009, Ant’s background before serverless is primarily as a solutions architect at various organizations, from managed service providers to Tier 1 telecommunications providers. His current focus is serverless, GraphQL, and Node.js.

Gaurav Kamboj – Mumbai, India

Community Hero Gaurav Kamboj is a cloud architect at Hotstar, India’s leading OTT provider, which holds a global concurrency record for live streaming to 11 million+ viewers. At Hotstar, he loves building cost-efficient infrastructure that can scale to millions in minutes. He is also passionate about chaos engineering and cloud security. Gaurav holds the original “all-five” AWS certifications, is co-founder of AWS User Group Mumbai, and speaks at local tech conferences. He also conducts guest lectures and workshops on cloud computing for students at engineering colleges affiliated with the University of Mumbai.

Jeremy Daly – Boston, USA

Serverless Hero Jeremy Daly is the CTO of AlertMe, a startup based in NYC that uses machine learning and natural language processing to help publishers better connect with their readers. He began building cloud-based applications with AWS in 2009. After discovering Lambda, he became a passionate advocate for FaaS and managed services. He now writes extensively about serverless on his blog, jeremydaly.com, and publishes Off-by-none, a weekly newsletter that focuses on all things serverless. As an active member of the serverless community, Jeremy contributes to a number of open-source serverless projects, and has created several others, including Lambda API, Serverless MySQL, and Lambda Warmer.

Kurt Lee – Seoul, South Korea

Serverless Hero Kurt Lee works at Vingle Inc. as their tech lead. As one of the original team members, he has been involved in nearly all backend applications there. Most recently, he led Vingle’s full migration to serverless, cutting 40% of the server cost. He’s known for sharing his experience of adopting serverless, along with its technical and organizational value, on Medium. He and his team maintain multiple open-source projects, which they developed during the migration. Kurt hosts TechTalk@Vingle regularly, and often presents at AWSKRUG about various aspects of serverless and pushing more things to serverless.

Matt Weagle – Seattle, USA

Serverless Hero Matt Weagle leverages machine learning, serverless techniques, and a servicefull mindset at Lyft, to create innovative transportation experiences in an operationally sustainable and secure manner. Matt looks to serverless as a way to increase collaboration across development, operational, security, and financial concerns and support rapid business-value creation. He has been involved in the serverless community for several years. Currently, he is the organizer of Serverless – Seattle and co-organizer of the serverlessDays Seattle event. He writes about serverless topics on Medium and Twitter.

Shingo Yoshida – Tokyo, Japan

Serverless Hero Shingo Yoshida is the CEO of Section-9, CTO of CYDAS, as well as a founder of Serverless Community (JP) and a member of JAWS-UG (AWS User Group – Japan). Since 2012, Shingo has built systems not just on AWS, but with cloud-native architectures, to make his customers happy. Serverless Community (JP) was established in 2016, and meetups have been held 20 times in Tokyo, Osaka, Fukuoka, and Sapporo, including three full-day conferences. Through this community, thousands of participants have discovered the value of serverless. Shingo has contributed to the serverless scene with many blog posts and books about serverless, including Serverless Architectures on AWS.

There are now 80 AWS Heroes worldwide. Learn about all of them and connect with an AWS Hero.

Categories: Cloud

Podcast #299: February 2019 Updates

AWS Blog - Mon, 02/18/2019 - 07:39

Simon guides you through lots of new features, services, and capabilities that you can take advantage of, including the new AWS Backup service, more powerful GPU capabilities, new SLAs, and much, much more!

Chapters:

Service Level Agreements 0:17
Storage 0:57
Media Services 5:08
Developer Tools 6:17
Analytics 9:54
AI/ML 12:07
Database 14:47
Networking & Content Delivery 17:32
Compute 19:02
Solutions 21:57
Business Applications 23:38
AWS Cost Management 25:07
Migration & Transfer 25:39
Application Integration 26:07
Management & Governance 26:32
End User Computing 29:22

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Podcast 298: [Public Sector Special Series #6] – Bringing the White House to the World

AWS Blog - Thu, 02/14/2019 - 06:40

Dr. Stephanie Tuszynski (Director of the Digital Library – White House Historical Association) speaks about how they used AWS to bring the experience of the White House to the world.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Now Available – Five New Amazon EC2 Bare Metal Instances: M5, M5d, R5, R5d, and z1d

AWS Blog - Thu, 02/14/2019 - 05:09

Today we are launching the five new EC2 bare metal instances that I promised you a few months ago. Your operating system runs on the underlying hardware and has direct access to the processor and other hardware. The instances are powered by AWS-custom Intel® Xeon® Scalable (Skylake) processors that deliver sustained all-core Turbo performance.

Here are the specs:

Instance Name | Sustained All-Core Turbo | Logical Processors | Memory | Local Storage | EBS-Optimized Bandwidth | Network Bandwidth
m5.metal | Up to 3.1 GHz | 96 | 384 GiB | – | 14 Gbps | 25 Gbps
m5d.metal | Up to 3.1 GHz | 96 | 384 GiB | 4 x 900 GB NVMe SSD | 14 Gbps | 25 Gbps
r5.metal | Up to 3.1 GHz | 96 | 768 GiB | – | 14 Gbps | 25 Gbps
r5d.metal | Up to 3.1 GHz | 96 | 768 GiB | 4 x 900 GB NVMe SSD | 14 Gbps | 25 Gbps
z1d.metal | Up to 4.0 GHz | 48 | 384 GiB | 2 x 900 GB NVMe SSD | 14 Gbps | 25 Gbps

The M5 instances are designed for general-purpose workloads, such as web and application servers, gaming servers, caching fleets, and app development environments. The R5 instances are designed for high performance databases, web scale in-memory caches, mid-sized in-memory databases, real-time big data analytics, and other memory-intensive enterprise applications. The M5d and R5d variants also include 3.6 TB of local NVMe SSD storage.

z1d instances provide high compute performance and lots of memory, making them ideal for electronic design automation (EDA) and relational databases with high per-core licensing costs. The high CPU performance allows you to license fewer cores and significantly reduce your TCO for Oracle or SQL Server workloads.

All of the instances are powered by the AWS Nitro System, with dedicated hardware accelerators for EBS processing (including crypto operations), the software-defined network inside of each Virtual Private Cloud (VPC), ENA networking, and access to the local NVMe storage on the M5d, R5d, and z1d instances. Bare metal instances can also take advantage of Elastic Load Balancing, Auto Scaling, Amazon CloudWatch, and other AWS services.

In addition to being a great home for old-school applications and system software that are licensed specifically and exclusively for use on physical, non-virtualized hardware, bare metal instances can be used to run tools and applications that require access to low-level processor features such as performance counters. For example, Mozilla’s Record and Replay Framework (rr) records and replays program execution with low overhead, using the performance counters to measure application performance and to deliver signals and context-switch events with high fidelity. You can read their paper, Engineering Record And Replay For Deployability, to learn more.
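If you would like to try one of these from the command line, here is a minimal sketch using the AWS CLI; the AMI, key pair, and security group IDs are placeholders, so substitute values from your own account and region:

# Launch a single m5.metal instance (placeholder AMI, key pair, and security group)
$ aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type m5.metal \
    --key-name my-key-pair \
    --security-group-ids sg-0123456789abcdef0 \
    --count 1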

Launch One Today
m5.metal instances are available in the US East (N. Virginia and Ohio), US West (N. California and Oregon), Europe (Frankfurt, Ireland, London, Paris, and Stockholm), and Asia Pacific (Mumbai, Seoul, Singapore, Sydney, and Tokyo) AWS regions.

m5d.metal instances are available in the US East (N. Virginia and Ohio), US West (Oregon), Europe (Frankfurt, Ireland, Paris, and Stockholm), and Asia Pacific (Mumbai, Seoul, Singapore, and Sydney) AWS regions.

r5.metal instances are available in the US East (N. Virginia and Ohio), US West (N. California and Oregon), Europe (Frankfurt, Ireland, Paris, and Stockholm), Asia Pacific (Mumbai, Seoul, and Singapore), and AWS GovCloud (US-West) AWS regions.

r5d.metal instances are available in the US East (N. Virginia and Ohio), US West (N. California), Europe (Frankfurt, Paris, and Stockholm), Asia Pacific (Mumbai, Seoul, and Singapore), and AWS GovCloud (US-West) AWS regions.

z1d.metal instances are available in the US East (N. Virginia), US West (N. California and Oregon), Europe (Ireland), and Asia Pacific (Singapore and Tokyo) AWS regions.

The bare metal instances will become available in even more AWS regions as soon as possible.

Jeff;

Categories: Cloud

New – Infrequent Access Storage Class for Amazon Elastic File System (EFS)

AWS Blog - Wed, 02/13/2019 - 12:44

Amazon Elastic File System lets you create petabyte-scale file systems that can be accessed in massively parallel fashion from hundreds or thousands of EC2 instances and on-premises servers, while scaling on demand without disrupting applications. Since the mid-2016 launch of EFS, we have added many new features including encryption of data at rest and in transit, a provisioned throughput option when you need high throughput access to a set of files that do not occupy a lot of space, on-premises access via AWS Direct Connect, EFS File Sync, support for AWS VPN and Inter-Region VPC Peering, and more.

Infrequent Access Storage Class
Today I would like to tell you about the new Amazon EFS Infrequent Access storage class, as pre-announced at AWS re:Invent. As part of a new Lifecycle Management option for EFS file systems, you can now indicate that you want to move files that have not been accessed in the last 30 days to a storage class that is 85% less expensive. You can enable the use of Lifecycle Management when you create a new EFS file system, and you can enable it later for file systems that were created on or after today’s launch.

The new storage class is totally transparent. You can still access your files as needed and in the usual way, with no code or operational changes necessary.

You can use the Infrequent Access storage class to meet auditing and retention requirements, create nearline backups that can be recovered using normal file operations, and to keep data close at hand that you need on an occasional basis.

Here are a couple of things to keep in mind:

Eligible Files – Files that are 128 KiB or larger and that have not been accessed or modified for at least 30 days can be transitioned to the new storage class. Modifications to a file’s metadata that do not change the file will not delay a transition.

Priority – Operations that transition files to Infrequent Access run at a lower priority than other operations on the file system.

Throughput – If your file system is configured for Bursting mode, the amount of Standard storage determines the throughput. Otherwise, the provisioned throughput applies.

Enabling Lifecycle Management
You can enable Lifecycle Management and benefit from the Infrequent Access storage class with one click:

As I noted earlier, you can check this when you create the file system, or you can enable it later for file systems that you create from now on.
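You can also enable it from the AWS CLI; here is a minimal sketch, with a placeholder file system ID:

# Enable Lifecycle Management on an existing file system (placeholder ID)
$ aws efs put-lifecycle-configuration \
    --file-system-id fs-0123abcd \
    --lifecycle-policies TransitionToIA=AFTER_30_DAYS

# Confirm the setting
$ aws efs describe-lifecycle-configuration --file-system-id fs-0123abcd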

Files that have not been read or written for 30 days will be transitioned to the Infrequent Access storage class with no further action on your part. Files in the Standard Access class can be accessed with latency measured in single-digit milliseconds; files in the Infrequent Access class have latency in the low double-digit milliseconds. Your next AWS bill will include information on your use of both storage classes, so that you can see your cost savings.

Available Now
This feature is available now and you can start using it today in all AWS Regions where EFS is available. Infrequent Access storage is billed at $0.045 per GB/Month in US East (N. Virginia), with correspondingly low pricing in other regions. There’s also a data transfer charge of $0.01 per GB for reads and writes to Infrequent Access storage.
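As a rough, back-of-the-envelope illustration (assuming, for the sake of the example, the US East (N. Virginia) Standard rate of $0.30 per GB-month), keeping 1,000 GB of rarely touched files would compare like this:

1,000 GB x $0.30  = $300.00 per month in Standard storage
1,000 GB x $0.045 =  $45.00 per month in Infrequent Access storage
Plus $0.01 per GB for any reads and writes to the Infrequent Access files.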

Like every AWS service and feature, we are launching with an initial set of features and a really strong roadmap! For example, we are working on additional lifecycle management flexibility, and would be very interested in learning more about what kinds of times and rules you would like.

Jeff;

PS – AWS DataSync will help you to quickly and easily automate data transfer between your existing on-premises storage and EFS.

Categories: Cloud

Podcast #297: Reinforcement Learning with AWS DeepRacer

AWS Blog - Mon, 02/11/2019 - 09:52

How are ML Models Trained? How can developers learn different approaches to solving business problems? How can we race model cars on a global scale? Todd Escalona (Solutions Architect Evangelist, AWS) joins Simon to dive into reinforcement learning and AWS DeepRacer!

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Podcast 296: [Public Sector Special Series #5] – Creating Better Educational Outcomes Using AWS | February 6, 2019

AWS Blog - Wed, 02/06/2019 - 13:23

Cesar Wedemann (QEDU) talks to Simon about how they gather education data and provide it to teachers and public schools to improve education in Brazil. They developed a free-access portal that offers easy visualization of Brazilian open education data.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Learn about AWS Services & Solutions – February 2019 AWS Online Tech Talks

AWS Blog - Tue, 02/05/2019 - 16:16

Join us this February to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Application Integration

February 20, 2019 | 11:00 AM – 12:00 PM PT – Customer Showcase: Migration & Messaging for Mission Critical Apps with S&P Global Ratings – Learn how S&P Global Ratings meets the high availability and fault tolerance requirements of their mission critical applications using Amazon MQ.

AR/VR

February 28, 2019 | 1:00 PM – 2:00 PM PT – Build AR/VR Apps with AWS: Creating a Multiplayer Game with Amazon Sumerian – Learn how to build real-world augmented reality, virtual reality and 3D applications with Amazon Sumerian.

Blockchain

February 18, 2019 | 11:00 AM – 12:00 PM PT – Deep Dive on Amazon Managed Blockchain – Explore the components of blockchain technology, discuss use cases, and do a deep dive into capabilities, performance, and key innovations in Amazon Managed Blockchain.

Compute

February 25, 2019 | 9:00 AM – 10:00 AM PT – What’s New in Amazon EC2 – Learn about the latest innovations in Amazon EC2, including new instance types, related technologies, and consumption options that help you optimize running your workloads for performance and cost.

February 27, 2019 | 1:00 PM – 2:00 PM PT – Deploy and Scale Your First Cloud Application with Amazon Lightsail – Learn how to quickly deploy and scale your first multi-tier cloud application using Amazon Lightsail.

Containers

February 19, 2019 | 9:00 AM – 10:00 AM PT – Securing Container Workloads on AWS Fargate – Explore the security controls and best practices for securing containers running on AWS Fargate.

Data Lakes & Analytics

February 18, 2019 | 1:00 PM – 2:00 PM PT – Amazon Redshift Tips & Tricks: Scaling Storage and Compute Resources – Learn about the tools and best practices Amazon Redshift customers can use to scale storage and compute resources on-demand and automatically to handle growing data volume and analytical demand.

Databases

February 18, 2019 | 9:00 AM – 10:00 AM PT – Building Real-Time Applications with Redis – Learn about Amazon’s fully managed Redis service and how it makes it easier, simpler, and faster to build real-time applications.

February 21, 2019 | 1:00 PM – 2:00 PM PT – Introduction to Amazon DocumentDB (with MongoDB Compatibility) – Get an introduction to Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that makes it easy to run, manage & scale MongoDB workloads.

DevOps

February 20, 2019 | 1:00 PM – 2:00 PM PT – Fireside Chat: DevOps at Amazon with Ken Exner, GM of AWS Developer Tools – Join our fireside chat with Ken Exner, GM of Developer Tools, to learn about Amazon’s DevOps transformation journey and latest practices and tools that support the current DevOps model.

End-User Computing

February 28, 2019 | 9:00 AM – 10:00 AM PT – Enable Your Remote and Mobile Workforce with Amazon WorkLink – Learn about Amazon WorkLink, a new, fully-managed service that provides your employees secure, one-click access to internal corporate websites and web apps using their mobile phones.

Enterprise & Hybrid

February 26, 2019 | 1:00 PM – 2:00 PM PT – The Amazon S3 Storage Classes – For cloud ops professionals, by cloud ops professionals. Wallace and Orion will tackle your toughest AWS hybrid cloud operations questions in this live Office Hours tech talk.

IoT

February 26, 2019 | 9:00 AM – 10:00 AM PT – Bring IoT and AI Together – Learn how to bring intelligence to your devices with the intersection of IoT and AI.

Machine Learning

February 19, 2019 | 1:00 PM – 2:00 PM PT – Getting Started with AWS DeepRacer – Learn about the basics of reinforcement learning, what’s under the hood, opportunities to get hands-on with AWS DeepRacer, and how to participate in the AWS DeepRacer League.

February 20, 2019 | 9:00 AM – 10:00 AM PT – Build and Train Reinforcement Models with Amazon SageMaker RL – Learn how to use Amazon SageMaker RL to apply reinforcement learning and build intelligent applications for your business.

February 21, 2019 | 11:00 AM – 12:00 PM PT – Train ML Models Once, Run Anywhere in the Cloud & at the Edge with Amazon SageMaker Neo – Learn about Amazon SageMaker Neo, which lets you train ML models once and run them anywhere in the cloud and at the edge.

February 28, 2019 | 11:00 AM – 12:00 PM PT – Build your Machine Learning Datasets with Amazon SageMaker Ground Truth – Learn how customers are using Amazon SageMaker Ground Truth to build highly accurate training datasets for machine learning quickly and reduce data labeling costs by up to 70%.

Migration

February 27, 2019 | 11:00 AM – 12:00 PM PT – Maximize the Benefits of Migrating to the Cloud – Learn how to group and rationalize applications and plan migration waves in order to realize the full set of benefits that cloud migration offers.

Networking

February 27, 2019 | 9:00 AM – 10:00 AM PT – Simplifying DNS for Hybrid Cloud with Route 53 Resolver – Learn how to enable DNS resolution in hybrid cloud environments using Amazon Route 53 Resolver.

Productivity & Business Solutions

February 26, 2019 | 11:00 AM – 12:00 PM PT – Transform the Modern Contact Center Using Machine Learning and Analytics – Learn how to integrate Amazon Connect and AWS machine learning services, such as Amazon Lex, Amazon Transcribe, and Amazon Comprehend, to quickly process and analyze thousands of customer conversations and gain valuable insights.

Serverless

February 19, 2019 | 11:00 AM – 12:00 PM PT – Best Practices for Serverless Queue Processing – Learn the best practices of serverless queue processing, using Amazon SQS as an event source for AWS Lambda.

Storage

February 25, 2019 | 11:00 AM – 12:00 PM PT – Introducing AWS Backup: Automate and Centralize Data Protection in the AWS Cloud – Learn about this new, fully managed backup service that makes it easy to centralize and automate the backup of data across AWS services in the cloud as well as on-premises.

Categories: Cloud

Podcast 294: [Public Sector Special Series #4] – Using AI to make Content Available for Students at Imperial College of London

AWS Blog - Wed, 01/30/2019 - 11:31

How do you train the next generation of Digital leaders? How do you provide them with a modern educational experience? Can you do it without technical expertise? Hear how Ruth Black (Teaching Fellow at the Digital Academy) applied Amazon Transcribe to make this real.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives, and interviews.

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Subscribe with one of the following:

Categories: Cloud

Podcast 293: Diving into Data with Amazon Athena

AWS Blog - Mon, 01/28/2019 - 13:51

Do you have lots of data to analyze? Is writing SQL a skill you have? Would you like to analyze massive amounts of data at low cost without capacity planning? In this episode, Simon shares how Amazon Athena can give you options you may not have considered before.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

New – TLS Termination for Network Load Balancers

AWS Blog - Thu, 01/24/2019 - 10:03

When you access a web site using the HTTPS protocol, a whole lot of interesting work (formally known as an SSL/TLS handshake) happens to create and maintain a secure communication channel. Your client (browser) and the web server work together to negotiate a mutually agreeable cipher, exchange keys, and set up a session key. Once established, both ends of the conversation use the session key to encrypt and decrypt all further traffic. Because the session key is unique to the conversation between the client and the server, a third party cannot decrypt the traffic or interfere with the conversation.
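If you are curious about what your own client and a given server agreed on, one quick way to peek at the result of the handshake (not part of the NLB setup, just a handy diagnostic) is the openssl command line tool; the host name below is only an example:

# Show the negotiated protocol version and cipher suite for a TLS connection
$ openssl s_client -connect www.example.com:443 -servername www.example.com < /dev/null 2>/dev/null | grep -E "Protocol|Cipher"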

New TLS Termination
Today we are simplifying the process of building secure web applications by giving you the ability to make use of TLS (Transport Layer Security) connections that terminate at a Network Load Balancer (you can think of TLS as providing the “S” in HTTPS). This will free your backend servers from the compute-intensive work of encrypting and decrypting all of your traffic, while also giving you a host of other features and benefits:

Source IP Preservation – The source IP address and port are presented to your backend servers, even when TLS is terminated at the NLB. This is, as my colleague Colm says, “insane magic!”

Simplified Management – Using TLS at scale means that you need to take responsibility for distributing your server certificate to each backend server. This creates extra management work (sometimes involving a fleet of proxy servers), and also increases your attack surface due to the presence of multiple copies of the certificate. Today’s launch removes all of that complexity and gives you a central management point for your certificates. If you are using AWS Certificate Manager (ACM), your certificates will be stored securely, expired & rotated regularly, and updated automatically, all with no action on your part.

Zero-day Patching – The TLS protocol is complex and the implementations are updated from time to time in response to emerging threats. Terminating your connections at the NLB protects your backend servers and allows us to update your NLB in response to these threats. We make use of s2n, our security-focused, formally-verified implementation of the TLS/SSL protocols.

Improved Compliance – You can use built-in security policies to specify the cipher suites and protocol versions that are acceptable to your application. This will help you in your PCI and FedRAMP compliance effort, and will also allow you to achieve a perfect TLS score.

Classic Upgrade – If you are currently using a Classic Load Balancer for TLS termination, switching to a Network Load Balancer will allow you to scale more quickly in response to an increased load. You will also be able to make use of a static IP address for your NLB and to log the source IP address for requests.

Access Logs – You now have the ability to enable access logs for your Network Load Balancers and to direct them to the S3 bucket of your choice. The log entries include detailed information about the TLS protocol version, cipher suite, connection time, handshake time, and more.

Using TLS Termination
You can create a Network Load Balancer and make use of TLS termination in minutes! You can use the API (CreateLoadBalancer), CLI (create-load-balancer), the EC2 Console, or an AWS CloudFormation template. I’ll use the Console, and click Load Balancers to get started. Then I click Create in the Network Load Balancer area:

I enter a name (MyLB2) and choose TLS (Secure TCP) as the Load Balancer Protocol:

Then I choose one or more Availability Zones, and optionally choose an Elastic IP address for each one. I can also choose to tag my NLB. When I am all set, I click Next: Configure Security Settings to proceed:

On the next page, I can choose an existing certificate or upload a new one. I already have one for www.jeff-barr.com, so I’ll choose it. I also choose a security policy (more on that in a minute):

There are currently seven security policies to choose from. Each policy allows for the use of certain TLS versions and ciphers:

The describe-ssl-policies command can be used to learn more about the policies:

After choosing the certificate and the policy, I click Next: Configure Routing. I can choose the communication protocol (TCP or TLS) that will be used between my NLB and my targets. If I choose TLS, communication is encrypted; this allows you to make use of complete end-to-end encryption in transit:

The remainder of the setup process proceeds as usual, and I can start using my Network Load Balancer right away.
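The same setup can also be scripted; here is a minimal sketch using the CLI, with placeholder subnet IDs, certificate ARN, and target group ARN:

# Create a Network Load Balancer (placeholder subnets)
$ aws elbv2 create-load-balancer \
    --name MyLB2 \
    --type network \
    --subnets subnet-0aaa1111 subnet-0bbb2222

# Add a TLS listener that terminates TLS using an ACM certificate (placeholder ARNs)
$ aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/MyLB2/0123456789abcdef \
    --protocol TLS \
    --port 443 \
    --certificates CertificateArn=arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555 \
    --ssl-policy ELBSecurityPolicy-2016-08 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-targets/0123456789abcdef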

Available Now
TLS Termination is available now and you can start using it today in the US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and South America (São Paulo) Regions.

Jeff;

Categories: Cloud

Amazon WorkLink – Secure, One-Click Mobile Access to Internal Websites and Applications

AWS Blog - Wed, 01/23/2019 - 16:19

We want to make it easier for you and your colleagues to use your mobile devices to access internal corporate websites and applications. Our goal is to give your workforce controlled access to valuable intranet content while maintaining a strong security profile.

Introducing Amazon WorkLink
Today I would like to tell you about Amazon WorkLink. You get seamless access to internal websites and applications from your mobile device, with no need to modify or migrate any content. Amazon WorkLink is a fully managed, pay-as-you-go service that scales to meet the needs of any organization, and it is easy to set up and run. You get full control over the domains that are accessible from mobile devices, and you can use your existing SAML-based Identity Provider (IdP) to manage your user base.

Amazon WorkLink gains access to your internal resources through a Virtual Private Cloud (VPC). The resources can exist within that VPC (for example, applications hosted on EC2 instances), in another VPC that is peered with it, or on-premises. In the on-premises case, the resources must be accessible via an IPsec tunnel, AWS Direct Connect, or the new AWS Transit Gateway. Applications running in a VPC can use AWS PrivateLink to access AWS services while keeping all traffic on the AWS network.

Your users get a secure, non-invasive browsing experience. Corporate content is rendered within the AWS Cloud and delivered to each device over a secure connection. We’re launching with support for devices that run iOS 12, with support for Android 6+ coming within weeks.

Inside Amazon WorkLink
Amazon WorkLink lets you associate domains with each WorkLink fleet that you create. For example, you could associate phones.example.com, payroll.example.com, and tickets.example.com to provide your users with access to your phone directory, payroll system, and trouble ticketing system. When you associate a domain with a fleet, you need to prove to WorkLink that you control the domain. WorkLink will issue an SSL/TLS certificate for the domain and then establish and manage an endpoint to handle requests for the domain.

With the fleet created, you can use the email template provided by WorkLink to extend invitations to users. The users accept the invitations, install the WorkLink app, and sign in using their existing corporate identity.

The app installs itself as the first-tier DNS resolver and configures the device’s VPN connection so that it can access the WorkLink fleet. When a mobile user accesses a domain that is associated with their fleet, the requested content is fetched, rendered in the AWS Cloud, delivered to the device in vector form across a TLS connection, and displayed in the user’s existing mobile browser. Your users can interact with the content as usual: zooming, scrolling, and typing all work as expected. All HTML, CSS, and JavaScript content is rendered in the cloud on a fleet of EC2 instances isolated from other AWS customers; no content is stored or cached by browsers on the local devices. Encrypted versions of cookies are stored by the WorkLink app on the user devices. They are never decrypted on the devices, but are sent back to resume sessions when a user gets a new cloud-rendering container. Traffic to and from domains that are not associated with WorkLink continues to flow as before, and does not go through WorkLink.

Setting Up Amazon WorkLink
Let’s walk through the process of setting up a WorkLink fleet. I don’t have a genuine corporate network or intranet, so I’ll have to wave my hands a bit. I open the Amazon WorkLink Console and click Create fleet to get started:

I give my fleet a programmatic name (my-fleet), a display name (MyFleet), and click Create fleet to proceed:

My fleet is created in seconds, and is ready for further setup:

I click my-fleet to proceed; I can see the mandatory and optional setup steps at a glance:

I click Link IdP to use my existing SAML-style identity provider, click Choose file to upload an XML document that contains my identity provider’s metadata, and again click Link IdP to proceed:

WorkLink validates and processes the document, and generates a service provider metadata document. I download that document, and pass it along to the operator of the identity provider. The provider, in turn, uses the document to finalize the SAML federation for the identity provider:

Next, I click Link network to link my users to my company content. I can create a new VPC, or I can use an existing one. Either way, I should choose subnets in two or more Availability Zones in order to maximize availability. The chosen subnets must have enough free IP addresses to support the number of users that will be accessing the fleet; WorkLink will create and manage an Elastic Network Interface (ENI) for each connected user. I’ll use my existing VPC:

With my identity provider configured and my network linked, I can click Associate domain to indicate that I want my users to be able to access some content on my network. I enter the domain name, and click Next to proceed (let’s pretend that www.jeff-barr.com is an intranet site):

Now I need to prove that I have control over the domain. I can either modify the DNS configuration or I can respond to an email request. I’ll take the first option:

The console displays the necessary changes (an additional CNAME record) that I need to make to my domain:

I use Amazon Route 53 to maintain my DNS entries so it is easy to add the CNAME:
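If you script your DNS changes instead, the equivalent call is sketched below; the hosted zone ID, record name, and record value are hypothetical placeholders standing in for the values that the console displays:

# Add the validation CNAME record (hypothetical hosted zone ID, name, and value)
$ aws route53 change-resource-record-sets \
    --hosted-zone-id Z0123456789EXAMPLE \
    --change-batch '{
      "Changes": [{
        "Action": "CREATE",
        "ResourceRecordSet": {
          "Name": "_hypothetical-token.www.jeff-barr.com",
          "Type": "CNAME",
          "TTL": 300,
          "ResourceRecords": [{"Value": "_hypothetical-target.validations.example.aws."}]
        }
      }]
    }'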

Amazon WorkLink will validate the DNS entry (this can take four or five hours; email is a bit quicker). I can repeat this step for all desired domains, and I can add even more later.

After my domain has been validated, I click User invites to get an email invitation that I can send to my users:

Your users simply follow the directions and can start to enjoy remote access to the permitted sites and applications within minutes. For example:

Other powerful administrative features include the ability to set up and use device policies, and to configure delivery of audit logs to a new or existing Amazon Kinesis Data Stream:

Things to Know
Here are a couple of things to keep in mind when evaluating Amazon WorkLink:

Device Support – We are launching with support for devices that run iOS 12. Support for Android 6 devices will be ready within weeks.

Compatibility – Amazon WorkLink is designed to process and render most modern forms of web content, with support for video and audio on the drawing board. It does not support content that makes use of Flash, Silverlight, WebGL, or applets.

Identity Providers – Amazon WorkLink can be used with SAML-based identity providers today, with plans to support other types of providers based on customer requests and feedback.

Regions – You can create Amazon WorkLink fleets in AWS regions in North America and Europe today. Support for other regions is in the works for rollout later this year.

Pricing – Pricing is based on the number of users with an active browser session in a given month. You pay $5 per active user per month.

Available Now
Amazon WorkLink is available now and you can start using it today!

Jeff;

Categories: Cloud

Podcast 292: [Public Sector Special Series #3] – Moving to Microservices from an Organisational Standpoint | January 23, 2019

AWS Blog - Wed, 01/23/2019 - 13:48

Jeff Olson (VP & Chief Data Officer at College Board) talks about his experiences in fostering change from an organisational standpoint whilst moving to a microservices architecture.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Podcast 291 | January 2019 Update Show

AWS Blog - Mon, 01/21/2019 - 13:53

Simon takes you through a nice mix of updates and new things to take advantage of – even a price drop!
Chapters:

Service Level Agreements 00:19
Price Reduction 1:15
Databases 2:09
Service Region Expansion 3:52
Analytics 5:23
Machine Learning 7:13
Compute 7:55
IoT 9:37
Management 10:43
Mobile 11:33
Desktop 12:30
Certification 13:11

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

AWS Backup – Automate and Centrally Manage Your Backups

AWS Blog - Wed, 01/16/2019 - 16:05

AWS gives you the power to easily and dynamically create file systems, block storage volumes, relational databases, NoSQL databases, and other resources that store precious data. You can create them on a moment’s notice as the need arises, giving you access to as much storage as you need and opening the door to large-scale cloud migration. When you bring your sensitive data to the cloud, you need to make sure that you continue to meet business and regulatory compliance requirements, and you definitely want to make sure that you are protected against application errors.

While you can build your own backup tools using the snapshot operations built into many of the services that I listed above, creating an enterprise-wide backup strategy and the tools to implement it still takes a lot of work. We are changing that.

New AWS Backup
AWS Backup is designed to help you automate and centrally manage your backups. You can create policy-driven backup plans, monitor the status of on-going backups, verify compliance, and find / restore backups, all using a central console. Using a combination of the existing AWS snapshot operations and new, purpose-built backup operations, Backup backs up EBS volumes, EFS file systems, RDS & Aurora databases, DynamoDB tables, and Storage Gateway volumes to Amazon Simple Storage Service (S3), with the ability to tier older backups to Amazon Glacier. Because Backup includes support for Storage Gateway volumes, you can include your existing, on-premises data in the backups that you create.

Each backup plan includes one or more backup rules. The rules express the backup schedule, frequency, and backup window. Resources to be backed up can be identified explicitly or in a policy-driven fashion using tags. Lifecycle rules control storage tiering and expiration of older backups. Backup gathers the set of snapshots and the metadata that goes along with the snapshots into collections that define a recovery point. You get lots of control so that you can define your daily / weekly / monthly backup strategy, rest assured that your critical data is being backed up in accordance with your requirements, and restore that data on an as-needed basis. Backups are grouped into vaults, each encrypted by a KMS key.
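Everything in a plan can also be expressed as JSON and created from the CLI; here is a minimal sketch that mirrors the console walkthrough below (the plan ID, IAM role, vault name, and tag values are placeholders):

# Create a plan with one daily rule: move to cold storage after 30 days, expire after 180
$ aws backup create-backup-plan --backup-plan '{
    "BackupPlanName": "MyBackupPlan",
    "Rules": [{
      "RuleName": "MainBackup",
      "TargetBackupVaultName": "Default",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "Lifecycle": {"MoveToColdStorageAfterDays": 30, "DeleteAfterDays": 180}
    }]
  }'

# Assign resources to the plan by tag (placeholder plan ID, role, and tag)
$ aws backup create-backup-selection \
    --backup-plan-id 11111111-2222-3333-4444-555555555555 \
    --backup-selection '{
      "SelectionName": "TaggedResources",
      "IamRoleArn": "arn:aws:iam::123456789012:role/MyBackupRole",
      "ListOfTags": [{"ConditionType": "STRINGEQUALS", "ConditionKey": "backup", "ConditionValue": "daily"}]
    }'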

Using AWS Backup
You can get started with AWS Backup in minutes. Open the AWS Backup Console and click Create backup plan:

I can build a plan from scratch, start from an existing plan or define one using JSON. I’ll Build a new plan, and start by giving my plan a name:

Now I create the first rule for my backup plan. I call it MainBackup, indicate that I want it to run daily, define the lifecycle (transition to cold storage after 1 month, expire after 6 months), and select the Default vault:

I can tag the recovery points that are created as a result of this rule, and I can also tag the backup plan itself:

I’m all set, so I click Create plan to move forward:

At this point my plan exists and is ready to run, but it has just one rule and does not have any resource assignments (so there’s nothing to back up):

Now I need to indicate which of my resources are subject to this backup plan. I click Assign resources, and then create one or more resource assignments. Each assignment is named and references an IAM role that is used to create the recovery point. Resources can be denoted by tag or by resource ID, and I can use both in the same assignment. I enter all of the values and click Assign resources to wrap up:

The next step is to wait for the first backup job to run (I cheated by editing my backup window in order to get this post done as quickly as possible). I can peek at the Backup Dashboard to see the overall status:

Backups On Demand
I also have the ability to create a recovery point on demand for any of my resources. I choose the desired resource and designate a vault, then click Create an on-demand backup:

I indicated that I wanted to create the backup right away, so a job is created:

The job runs to completion within minutes:
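The same on-demand backup can be kicked off from the CLI; here is a sketch with placeholder ARNs:

# Create an on-demand recovery point for a single resource (placeholder ARNs)
$ aws backup start-backup-job \
    --backup-vault-name Default \
    --resource-arn arn:aws:dynamodb:us-east-1:123456789012:table/MyTable \
    --iam-role-arn arn:aws:iam::123456789012:role/MyBackupRole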

Inside a Vault
I can also view my collection of vaults, each of which contains multiple recovery points:

I can see the list of recovery points in a vault:

I can inspect a recovery point, and then click Restore to restore my table (in this case):

I’ve shown you the highlights, and you can discover the rest for yourself!

Things to Know
Here are a couple of things to keep in mind when you are evaluating AWS Backup:

Services – We are launching with support for EBS volumes, RDS databases, DynamoDB tables, EFS file systems, and Storage Gateway volumes. We’ll add support for additional services over time, and welcome your suggestions. Backup uses the existing snapshot operations for all services except EFS file systems.

Programmatic Access – You can access all of the functions that I showed you above using the AWS Command Line Interface (CLI) and the AWS Backup APIs. The APIs are powerful integration points for your existing backup tools and scripts.

Regions – Backups work within the scope of a particular AWS Region, with plans in the works to enable several different types of cross-region functionality in 2019.

Pricing – You pay the normal AWS charges for backups that are created using the built-in AWS snapshot facilities. For Amazon EFS, there’s a low, per-GB charge for warm storage and an even lower charge for cold storage.

Available Now
AWS Backup is available now and you can start using it today!

Jeff;

Categories: Cloud

#290: [Public Sector Special Series #2] – Using AWS to Power Research with the University of Liverpool and Alces Flight

AWS Blog - Wed, 01/16/2019 - 13:50

Cliff Addison (University of Liverpool) joins with Will Mayers and Cristin Merritt (Alces Flight) to talk about High Performance Computing in the cloud and meeting the needs of researchers.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

Behind the Scenes & Under the Carpet – The CenturyLink Network that Powered AWS re:Invent 2018

AWS Blog - Wed, 01/16/2019 - 08:16

If you are a long-time reader, you may have already figured out that I am fascinated by the behind-the-scenes and beneath-the-streets activities that enable and power so much of our modern world. For example, late last year I told you how The AWS Cloud Goes Underground at re:Invent and shared some information about the communication and network infrastructure that was used to provide top-notch connectivity to re:Invent attendees and to those watching the keynotes and live streams from afar.

Today, with re:Invent 2018 in the rear-view mirror (and planning for next year already underway), I would like to tell you how 5-time re:Invent Network Services Provider CenturyLink designed and built a redundant, resilient network that used AWS Direct Connect to provide 180 Gbps of bandwidth and supported over 81,000 devices connected across eight venues. Above the ground, we worked closely with ShowNets to connect their custom network and WiFi deployment in each venue to the infrastructure provided by CenturyLink.

The 2018 re:Invent Network
This year, the network included diverse routes to multiple AWS regions, with a brand-new multi-node metro fiber ring that encompassed the Sands Expo, Wynn Resort, Circus Circus, Mirage, Vdara, Bellagio, Aria, and MGM Grand facilities. Redundant 10 Gbps connections to each venue and to multiple AWS Direct Connect locations were used to ensure high availability. The network was provisioned using CenturyLink Cloud Connect Dynamic Connections.

Here’s a network diagram (courtesy of CenturyLink) that shows the metro fiber ring and the connectivity:

The network did its job, and supported keynotes, live streams, breakout sessions, hands-on labs, hackathons, workshops, and certification exams. Here are the final numbers, as measured on-site at re:Invent 2018:

  • Live Streams – Over 60K views from over 100 countries.
  • Peak Data Transfer – 9.5 Gbps across six 10 Gbps connections.
  • Total Data Transfer – 160 TB.

Thanks again to our Managed Service Partner for building and running the robust network that supported our customers, partners, and employees at re:Invent!

Jeff;

Categories: Cloud

Podcast #289: A Look at Amazon FSx For Windows File Server

AWS Blog - Tue, 01/15/2019 - 14:27

In this episode, Simon speaks with Andrew Crudge (Senior Product Manager, FSx) about this newly released service, capabilities available to customers and how to make the best use of it in your environment.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to awspodcast@amazon.com. We want to hear from you!

Categories: Cloud

New – Amazon DocumentDB (with MongoDB Compatibility): Fast, Scalable, and Highly Available

AWS Blog - Wed, 01/09/2019 - 14:51

A glance at the AWS Databases page will show you that we offer an incredibly wide variety of databases, each one purpose-built to address a particular need! In order to help you build the coolest and most powerful applications, you can mix and match relational, key-value, in-memory, graph, time series, and ledger databases.

Introducing Amazon DocumentDB (with MongoDB compatibility)
Today we are launching Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that is designed to be compatible with your existing MongoDB applications and tools. Amazon DocumentDB uses a purpose-built SSD-based storage layer, with 6x replication across 3 separate Availability Zones. The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads.

Each MongoDB database contains a set of collections. Each collection (similar to a relational database table) contains a set of documents, each in the JSON-like BSON format. For example:

{
  name: "jeff",
  full_name: {first: "jeff", last: "barr"},
  title: "VP, AWS Evangelism",
  email: "jbarr@amazon.com",
  city: "Seattle",
  foods: ["chocolate", "peanut butter"]
}

Each document can have a unique set of field-value pairs and data; there are no fixed or predefined schemas. The MongoDB API includes the usual CRUD (create, read, update, and delete) operations along with a very rich query model. This is just the tip of the iceberg (the MongoDB API is very powerful and flexible), so check out the list of supported MongoDB operations, data types, and functions to learn more.
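For instance, here is a minimal sketch of a few of those operations in the mongo shell, against a hypothetical people collection:

// Insert a document (hypothetical collection name)
db.people.insertOne({ name: "jeff", full_name: {first: "jeff", last: "barr"}, city: "Seattle" })

// Query on a nested field
db.people.find({ "full_name.last": "barr" })

// Update a single document
db.people.updateOne({ name: "jeff" }, { $set: { title: "VP, AWS Evangelism" } })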

All About Amazon DocumentDB
Here’s what you need to know about Amazon DocumentDB:

Compatibility – Amazon DocumentDB is compatible with version 3.6 of MongoDB.

Scalability – Storage can be scaled from 10 GB up to 64 TB in increments of 10 GB. You don’t need to preallocate storage or monitor free space; Amazon DocumentDB will take care of that for you. You can choose between six instance sizes (15.25 GiB to 488 GiB of memory), and you can create up to 15 read replicas. Storage and compute are decoupled and you can scale each one independently and as-needed.

Performance – Amazon DocumentDB stores database changes as a log stream, allowing you to process millions of reads per second with millisecond latency. The storage model provides a nice performance increase without compromising data durability, and greatly enhances overall scalability.

Reliability – The 6-way storage replication ensures high availability. Amazon DocumentDB can failover from a primary to a replica within 30 seconds, and supports MongoDB replica set emulation so applications can handle failover quickly.

Fully Managed – Like the other AWS database services, Amazon DocumentDB is fully managed, with built-in monitoring, fault detection, and failover. You can set up daily snapshot backups, take manual snapshots, and use either one to create a fresh cluster if necessary. You can also do point-in-time restores (with second-level resolution) to any point within the 1-35 day backup retention period.

Secure – You can choose to encrypt your active data, snapshots, and replicas with the KMS key of your choice when you create each of your Amazon DocumentDB clusters. Authentication is enabled by default, as is encryption of data in transit.

Compatible – As I said earlier, Amazon DocumentDB is designed to work with your existing MongoDB applications and tools. Just be sure to use drivers intended for MongoDB 3.4 or newer. Internally, Amazon DocumentDB implements the MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server.

Creating An Amazon DocumentDB (with MongoDB compatibility) Cluster
You can create a cluster from the Console, Command Line, CloudFormation, or by making a call to the CreateDBCluster function. I’ll use the Amazon DocumentDB Console today. I open the console and click Launch Amazon DocumentDB to get started:

I name my cluster, choose the instance class, and specify the number of instances (one is the primary and the rest are replicas). Then I enter a master username and password:

I can use any of the following instance classes for my cluster:

At this point I can click Create cluster to use default settings, or I can click Show advanced settings for additional control. I can choose any desired VPC, subnets, and security group. I can also set the port and parameter group for the cluster:

I can control encryption (enabled by default), set the backup retention period, and establish the backup window for point-in-time restores:

I can also control the maintenance window for my new cluster. Once I am ready I click Create cluster to proceed:

My cluster starts out in creating status, and switches to available very quickly:

As do the instances in the cluster:

Connecting to a Cluster
With the cluster up and running, I install the mongo shell on an EC2 instance (details depend on your distribution) and fetch a certificate so that I can make a secure connection:

$ wget https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem

The console shows me the command that I need to use to make the connection:

I simply customize the command with the password that I specified when I created the cluster:
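
The customized command looks roughly like this; the cluster endpoint and credentials shown here are hypothetical placeholders:

$ mongo --ssl \
    --host sample-cluster.cluster-example.us-east-1.docdb.amazonaws.com:27017 \
    --sslCAFile rds-combined-ca-bundle.pem \
    --username jeff \
    --password <secret>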

From there I can use any of the mongo shell commands to insert, query, and examine data. I inserted some very simple documents and then ran an equally simple query (I’m sure you can do a lot better):
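
A session along those lines might look like the following; the inventory collection and its documents are invented for illustration:

> db.inventory.insertMany([ { item: "pen", qty: 25 }, { item: "pencil", qty: 50 } ])
> db.inventory.find({ qty: { $gt: 30 } })   // returns only the pencil document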

Now Available
Amazon DocumentDB (with MongoDB compatibility) is available now and you can start using it today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) Regions. Pricing is based on the instance class, storage consumption for current documents and snapshots, I/O operations, and data transfer.

Jeff;


Western Digital HDD Simulation at Cloud Scale – 2.5 Million HPC Tasks, 40K EC2 Spot Instances

AWS Blog - Tue, 01/08/2019 - 08:49

Earlier this month my colleague Bala Thekkedath published a story about Extreme Scale HPC and talked about how AWS customer Western Digital built a cloud-scale HPC cluster on AWS and used it to simulate crucial elements of upcoming head designs for their next-generation hard disk drives (HDDs).

The simulation described in the story encompassed a little over 2.5 million tasks, and ran to completion in just 8 hours on a million-vCPU Amazon EC2 cluster. As Bala shared in his story, much of the simulation work at Western Digital revolves around the need to evaluate different combinations of technologies and solutions that comprise an HDD. The engineers focus on cramming ever-more data into the same space, improving storage capacity and increasing transfer speed in the process. Simulating millions of combinations of materials, energy levels, and rotational speeds allows them to pursue the highest density and the fastest read-write times. Getting the results more quickly allows them to make better decisions and lets them get new products to market more rapidly than before.

Here’s a visualization of Western Digital’s energy-assisted recording process in action. The top stripe represents the magnetism; the middle one represents the added energy (heat); and the bottom one represents the actual data written to the medium via the combination of magnetism and heat:

I recently spoke to my colleagues and to the teams at Western Digital and Univa who worked together to make this record-breaking run a reality. My goal was to find out more about how they prepared for this run, to see what they learned, and to share it with you in case you are ready to run a large-scale job of your own.

Ramping Up
About two years ago, the Western Digital team was running clusters as big as 80K vCPUs, powered by EC2 Spot Instances in order to be as cost-effective as possible. They had grown to the 80K vCPU level after repeated, successful runs with 8K, 16K, and 32K vCPUs. After these early successes, they decided to shoot for the moon, push the boundaries, and work toward a one million vCPU run. They knew that this would stress and tax their existing tools, and settled on a find/fix/scale-some-more methodology.

Univa’s Grid Engine is a batch scheduler. It is responsible for keeping track of the available compute resources (EC2 instances) and dispatching work to the instances as quickly and efficiently as possible. The goal is to get the job done in the smallest amount of time and at the lowest cost. Univa’s Navops Launch supports container-based computing and also played a critical role in this run by allowing the same containers to be used for Grid Engine and AWS Batch.

One interesting scaling challenge arose when 50K hosts created concurrent connections to the Grid Engine scheduler. Once running, the scheduler can dispatch up to 3000 tasks per second, with an extra burst in the (relatively rare) case that an instance terminates unexpectedly and signals the need to reschedule 64 or more tasks as quickly as possible. The team also found that referencing worker instances by IP addresses allowed them to sidestep some internal (AWS) rate limits on the number of DNS lookups per Elastic Network Interface.

The entire simulation is packaged in a Docker container for ease of use. When newly launched instances come online they register their specs (instance type, IP address, vCPU count, memory, and so forth) in an ElastiCache for Redis cluster. Grid Engine uses this data to find and manage instances; this is more efficient and scalable than calling DescribeInstances continually.
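
Here’s a rough sketch of what that boot-time registration could look like; this is not Western Digital’s actual code, and the Redis endpoint and key layout are hypothetical:

#!/bin/bash
# Register this instance's specs in Redis so the scheduler can discover it
# without calling DescribeInstances.
REDIS_HOST=example.abc123.ng.0001.use1.cache.amazonaws.com   # hypothetical endpoint
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
LOCAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
VCPUS=$(nproc)
MEM_KB=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
redis-cli -h "$REDIS_HOST" HMSET "instance:$INSTANCE_ID" \
    type "$INSTANCE_TYPE" ip "$LOCAL_IP" vcpus "$VCPUS" mem_kb "$MEM_KB"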

The simulation tasks read and write data from Amazon Simple Storage Service (S3), taking advantage of S3’s ability to store vast amounts of data and to handle any conceivable request rate.

Inside a Simulation Task
Each potential head design is described by a collection of parameters; the overall simulation run consists of an exploration of this parameter space. The results of the run help the designers to find designs that are buildable, reliable, and manufacturable. This particular run focused on modeling write operations.

Each simulation task ran for 2 to 3 hours, depending on the EC2 instance type. In order to avoid losing work if a Spot Instance is about to be terminated, the tasks checkpoint themselves to S3 every 15 minutes, with a bit of extra logic to cover the important case where the job finishes after the termination signal but before the actual shutdown.
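
A hedged sketch of that checkpoint-and-interruption logic might look like the script below; the bucket, task id, and file names are hypothetical, and this is not the actual Western Digital code:

#!/bin/bash
# Checkpoint the task to S3 every 15 minutes, and flush one final checkpoint
# if a Spot interruption notice appears.
BUCKET=s3://example-simulation-checkpoints
TASK_ID=task-0001
INTERVAL=900
checkpoint () { aws s3 cp ./checkpoint.dat "$BUCKET/$TASK_ID/checkpoint.dat"; }
last=$(date +%s)
while true; do
  # The spot/instance-action document exists only after an interruption has
  # been scheduled, giving roughly two minutes of warning.
  if curl -sf http://169.254.169.254/latest/meta-data/spot/instance-action >/dev/null; then
    checkpoint        # final flush before the instance is reclaimed
    break
  fi
  if (( $(date +%s) - last >= INTERVAL )); then
    checkpoint        # periodic 15-minute checkpoint
    last=$(date +%s)
  fi
  sleep 5             # poll the interruption notice frequently
done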

Making the Run
After just 6 weeks of planning and prep (including multiple large-scale AWS Batch runs to generate the input files), the combined Western Digital / Univa / AWS team was ready to make the full-scale run. They used an AWS CloudFormation template to start Grid Engine and launch the cluster. Due to the Redis-based tracking that I described earlier, they were able to start dispatching tasks to instances as soon as they became available. The cluster grew to one million vCPUs in 1 hour and 32 minutes and ran full-bore for 6 hours:

When there were no more undispatched tasks available, Grid Engine began to shut the instances down, reaching the zero-instance point in about an hour. During the run, Grid Engine was able to keep the instances fully supplied with work over 99% of the time. The run used a combination of C3, C4, M4, R3, R4, and M5 instances. Here’s the overall breakdown over the course of the run:

The job spanned all six Availability Zones in the US East (N. Virginia) Region. Spot bids were placed at the On-Demand price. Over the course of the run, about 1.5% of the instances in the fleet were terminated and automatically replaced; the vast majority of the instances stayed running for the entire time.

And That’s That
This job ran for 8 hours and cost $137,307 ($17,164 per hour). The folks I talked to estimated that this was about half the cost of making the run on an in-house cluster, if they had one of that size!

Evaluating the success of the run, Steve Phillpott (CIO of Western Digital) told us:

“Storage technology is amazingly complex and we’re constantly pushing the limits of physics and engineering to deliver next-generation capacities and technical innovation. This successful collaboration with AWS shows the extreme scale, power and agility of cloud-based HPC to help us run complex simulations for future storage architecture analysis and materials science explorations. Using AWS to easily shrink simulation time from 20 days to 8 hours allows Western Digital R&D teams to explore new designs and innovations at a pace un-imaginable just a short time ago.”

The Western Digital team behind this one is hiring an R&D Engineering Technologist; they also have many other open positions!

A Run for You
If you want to do a run on the order of 100K to 1M cores (or more), our HPC team is ready to help, as are our friends at Univa. To get started, Contact HPC Sales!

Jeff;

