
Feed aggregator

New – Machine Learning Inference at the Edge Using AWS Greengrass

AWS Blog - Wed, 04/04/2018 - 15:05

What happens when you combine the Internet of Things, Machine Learning, and Edge Computing? Before I tell you, let’s review each one and discuss what AWS has to offer.

Internet of Things (IoT) – Devices that connect the physical world and the digital one. The devices, often equipped with one or more types of sensors, can be found in factories, vehicles, mines, fields, homes, and so forth. Important AWS services include AWS IoT Core, AWS IoT Analytics, AWS IoT Device Management, and Amazon FreeRTOS, along with others that you can find on the AWS IoT page.

Machine Learning (ML) – Systems that can be trained using large-scale datasets and statistical algorithms, and then used to make inferences from fresh data. At Amazon we use machine learning to drive the recommendations that you see when you shop, to optimize the paths in our fulfillment centers, to fly drones, and much more. We support leading open source machine learning frameworks such as TensorFlow and MXNet, and make ML accessible and easy to use through Amazon SageMaker. We also provide Amazon Rekognition for images and video, Amazon Lex for chatbots, and a wide array of language services for text analysis, translation, speech recognition, and text-to-speech.

Edge Computing – The power to have compute resources and decision-making capabilities in disparate locations, often with intermittent or no connectivity to the cloud. AWS Greengrass builds on AWS IoT, giving you the ability to run Lambda functions and keep device state in sync even when not connected to the Internet.

ML Inference at the Edge
Today I would like to toss all three of these important new technologies into a blender! You can now perform Machine Learning inference at the edge using AWS Greengrass. This allows you to use the power of the AWS cloud (including fast, powerful instances equipped with GPUs) to build, train, and test your ML models before deploying them to small, low-powered, intermittently-connected IoT devices running in those factories, vehicles, mines, fields, and homes that I mentioned.

Here are a few of the many ways that you can put Greengrass ML Inference to use:

Precision Farming – With an ever-growing world population and unpredictable weather that can affect crop yields, the opportunity to use technology to increase yields is immense. Intelligent devices that are literally in the field can process images of soil, plants, pests, and crops, taking local corrective action and sending status reports to the cloud.

Physical Security – Smart devices (including the AWS DeepLens) can process images and scenes locally, looking for objects, watching for changes, and even detecting faces. When something of interest or concern arises, the device can pass the image or the video to the cloud and use Amazon Rekognition to take a closer look.

Industrial Maintenance – Smart, local monitoring can increase operational efficiency and reduce unplanned downtime. The monitors can run inference operations on power consumption, noise levels, and vibration to flag anomalies, predict failures, and detect faulty equipment.

Greengrass ML Inference Overview
There are several different aspects to this new AWS feature. Let’s take a look at each one:

Machine Learning Models – Precompiled TensorFlow and MXNet libraries, optimized for production use on the NVIDIA Jetson TX2 and Intel Atom devices, and development use on 32-bit Raspberry Pi devices. The optimized libraries can take advantage of GPU and FPGA hardware accelerators at the edge in order to provide fast, local inferences.

Model Building and Training – The ability to use Amazon SageMaker and other cloud-based ML tools to build, train, and test your models before deploying them to your IoT devices. To learn more about SageMaker, read Amazon SageMaker – Accelerated Machine Learning.

Model Deployment – SageMaker models can (if you give them the proper IAM permissions) be referenced directly from your Greengrass groups. You can also make use of models stored in S3 buckets. You can add a new machine learning resource to a group with a couple of clicks:
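Once the model resource is deployed to a group, the Lambda function running on the device can load it from the local path that you configure for the resource and run inferences against it. Here is a minimal sketch of that pattern with MXNet; the resource path, model prefix, and input shape are assumptions for illustration, not values from this post:

import mxnet as mx

# Hypothetical local path where the Greengrass group makes the model resource
# available to the function; use whatever path you configured for the resource.
MODEL_PREFIX = '/greengrass-machine-learning/mxnet/resnet-18'

# Load the network once, outside the handler, so it stays warm between invocations.
sym, arg_params, aux_params = mx.model.load_checkpoint(MODEL_PREFIX, 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)

def function_handler(event, context):
    # In a real device function the input would come from a local camera or
    # sensor; here we assume the event carries a preprocessed image tensor.
    batch = mx.io.DataBatch([mx.nd.array(event['image'])])
    mod.forward(batch)
    probabilities = mod.get_outputs()[0].asnumpy()
    return {'top_class': int(probabilities.argmax())}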

These new features are available now and you can start using them today! To learn more, read Perform Machine Learning Inference.

Jeff;

 

Categories: Cloud

New – Encryption of Data in Transit for Amazon EFS

AWS Blog - Wed, 04/04/2018 - 15:03

Amazon Elastic File System was designed to be the file system of choice for cloud-native applications that require shared access to file-based storage. We launched EFS in mid-2016 and have added several important features since then including on-premises access via Direct Connect and encryption of data at rest. We have also made EFS available in additional AWS Regions, most recently US West (Northern California). As was the case with EFS itself, these enhancements were made in response to customer feedback, and reflect our desire to serve an ever-widening customer base.

Encryption in Transit
Today we are making EFS even more useful with the addition of support for encryption of data in transit. When used in conjunction with the existing support for encryption of data at rest, you now have the ability to protect your stored files using a defense-in-depth security strategy.

In order to make it easy for you to implement encryption in transit, we are also releasing an EFS mount helper. The helper (available in source code and RPM form) takes care of setting up a TLS tunnel to EFS, and also allows you to mount file systems by ID. The two features are independent; you can use the helper to mount file systems by ID even if you don’t make use of encryption in transit. The helper also supplies a recommended set of default options to the actual mount command.

Setting up Encryption
I start by installing the EFS mount helper on my Amazon Linux instance:

$ sudo yum install -y amazon-efs-utils

Next, I visit the EFS Console and capture the file system ID:

Then I specify the ID (and the TLS option) to mount the file system:

$ sudo mount -t efs fs-92758f7b -o tls /mnt/efs

And that’s it! The encryption is transparent and has an almost negligible impact on data transfer speed.

Available Now
You can start using encryption in transit today in all AWS Regions where EFS is available.

The mount helper is available for Amazon Linux. If you are running another distribution of Linux you will need to clone the GitHub repo and build your own RPM, as described in the README.

Jeff;

Categories: Cloud

AWS Certificate Manager Launches Private Certificate Authority

AWS Blog - Wed, 04/04/2018 - 11:13

Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay-as-you-go pricing. This enables developers to provision certificates in just a few API calls, while administrators have a central CA management console and fine-grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features, so let’s jump in and provision a CA.

Provisioning a Private Certificate Authority (CA)

First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, the only option is to provision a subordinate CA, so I’ll select that, use my super-secure desktop as the root CA, and click Next. This isn’t what I would do in a production setting, but it will work for testing out our private CA.

Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.

Now I need to choose my key algorithm. You should choose the best algorithm for your needs, but know that ACM has a limitation today: it can only manage certificates that chain up to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.

In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me and I have the option of routing my S3 bucket to a custom domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.

Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.

A few seconds later, I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.

Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).

Now I can use a tool like openssl to sign my cert and generate the certificate chain.

$ openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName   :ASN.1 12:'Washington'
localityName          :ASN.1 12:'Seattle'
organizationName      :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName            :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y
1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

After that I’ll copy my subordinate_cert.pem and certificate chain back into the console and click Next.

Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.

Now that I have a private CA, we can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking to create a new certificate, I’ll select the Request a private certificate radio button and then click Request a certificate.

From there, the process is similar to provisioning a normal certificate in ACM.
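The same request can also be made programmatically by pointing ACM at the private CA. Here is a minimal boto3 sketch; the domain name and CA ARN are placeholders:

import boto3

acm = boto3.client('acm')

# Ask ACM to issue a certificate signed by the private CA rather than a public one.
# Both the domain name and the CA ARN below are placeholders.
response = acm.request_certificate(
    DomainName='api.secure.internal',
    CertificateAuthorityArn='arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/EXAMPLE',
    IdempotencyToken='demo1'
)
print(response['CertificateArn'])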

Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.

Available Now
ACM Private CA is a service in and of itself, packed with more features than will fit into a blog post. I strongly encourage interested readers to go through the developer guide and familiarize themselves with certificate-based security. ACM Private CA is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), and EU (Ireland). Private CAs cost $400 per month (prorated) for each private CA. You are not charged for certificates created and maintained in ACM, but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered, starting at $0.75 per certificate for the first 1,000 certificates and going down to $0.001 per certificate after 10,000 certificates.

I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.

Randall

Categories: Cloud

AWS Firewall Manager: Central Management for Your Web Application Portfolio

AWS Blog - Wed, 04/04/2018 - 11:09

There’s often tension between distributed and centralized control, especially in larger organizations. While a distributed control model allows teams to move fast and to respond to specialized local needs, a central model can provide the right level of oversight for global initiatives and challenges that span all teams.

We’ve seen this challenge arise first-hand when AWS customers grow to the point where their application footprint encompasses a plethora of AWS regions, AWS accounts, development teams, and applications. They love the fact that AWS increases their agility and responsiveness, while letting them deploy resources in the most appropriate location. This diversity and scale brings new challenges when it comes to security and compliance. The freedom to innovate must be balanced by the need to protect important data and to respond quickly when threats emerge.

Over the last couple of years we have provided our customers with an increasingly broad set of options for protection including AWS WAF and AWS Shield. Our customers are making great use of all of these options, and have asked for the ability to manage them from a single, central location.

Meet AWS Firewall Manager
AWS Firewall Manager is designed to help these customers! It gives them the freedom to use multiple AWS accounts and to host applications in any desired region while maintaining centralized control over their organization’s security settings and profile. Developers can develop and innovators can innovate, while the security team gains the ability to respond quickly, uniformly, and globally to potential threats and actual attacks.

With automated policy enforcement across accounts & applications, your security team can be confident that new and existing applications comply with organization-wide security policies when they use Firewall Manager. They can find applications and AWS resources that don’t measure up, and bring them into compliance in minutes.

Firewall Manager is built around named policies that contain WAF rule sets and optional AWS Shield advanced protection. Each policy applies to a specific set of AWS resources, specified by account, resource type, resource identifier, or tag. Policies can be applied automatically to all matching resources, or to a subset that you select. Policies can include WAF rules drawn from within the organization, and also those created by AWS Partners such as Imperva, F5, Trend Micro, and other AWS Marketplace vendors. This gives your security team the power to duplicate their existing on-premises security posture in the cloud.

Take the Tour
Firewall Manager has three prerequisites:

AWS Organizations – Your organization must be using AWS Organizations to manage your accounts and all features must be enabled. To learn more, read Creating an Organization.

Firewall Administrator – You must designate one of the AWS accounts in your organization as the administrator for Firewall Manager. This gives the account permission to deploy AWS WAF rules across the organization.

AWS Config – You must enable AWS Config for all of the accounts in the Organization so that Firewall Manager can detect newly created resources (you can use the Enable AWS Config template on the StackSets Sample Templates page to take care of this). To learn more, read Getting Started with AWS Config.

Since I don’t own an enterprise, my colleagues were kind enough to create some test accounts for me! When I open the Firewall Manager Console in the master account, I can see where I stand with respect to the first two prerequisites:

The Learn more about… button reveals the Account ID of the administrator:

I switch to that account (in a real-world situation it is unlikely that I would have access to the master account and this one), open the console, and see that I now meet the prerequisites. I click Create policy to move ahead:

The console outlines the process for me. I need to create rules and a rule group, define a policy with the rule group, define the scope of the policy, and then actually create the policy.

At the bottom of the page I choose to create a new policy and rule group, for resources in the US East (N. Virginia) Region, and click Next:

Then I specify the conditions for my rule, choosing from the following options:

  • Cross-site scripting
  • Geographic origin
  • SQL injection
  • IP address or range
  • Size constraint
  • String or regular expression

For example, I can create a condition that blocks malicious IP addresses (this AWS Solution shows you how to use a third-party reputation list with WAF, and may be helpful):

I’ll keep this one simple, but a rule can include multiple conditions. After I have added all of them, I click Next to proceed. Now I am ready to create my rule, and I click Create rule (I can add more conditions to it later if I want):

I give my rule a name (BlockExcludedIPs), enter a CloudWatch metric name, and add my condition (ExcludeIPs), then click Create:

I can create more rules, and include them in the same rule group. Again, I’ll keep this one simple, and click Next to move ahead:

I enter a name for my group, choose the rules that will make up the group, and click Create:

I now have two rule groups (testRuleGroup was already present in the account). I name my policy and click Next to proceed:

Now I define the scope of my policy. I choose the type of resource to be protected, and indicate when the policy should be applied:

I can also use tags to include or exclude resources:

Once I have defined the scope of my policy I click Next and review it, then click Create policy:

Now that the policy is in force, the ALBs within its scope are initially noncompliant:

Within minutes, Firewall Manager applies the policy and provides me with a status report:

Start Using AWS Firewall Manager Today
You can start using AWS Firewall Manager today!

If you are using AWS Shield Advanced, you have access to AWS Firewall Manager and AWS WAF at no extra charge. If not, you are charged a monthly fee for each policy in each region, along with the usual charges for WAF WebACLs, WAF Rules, and AWS Config Rules.

Jeff;

Categories: Cloud

AWS Config Rules Update: Aggregate Compliance Data Across Accounts and Regions

AWS Blog - Wed, 04/04/2018 - 11:07

As I have discussed in the past, sophisticated AWS customers invariably control multiple AWS accounts. Some of these are the result of acquisitions or a holdover from bottom-up, departmental adoption of cloud computing. Others create multiple accounts in order to isolate developers, projects, or departments from each other. We strongly endorse this as a best practice, and back it up with cross-account features in many AWS services, as well as AWS Organizations for policy-based management that spans accounts. Many of these customers also make great use of AWS Config and use Config Rules (both their own and those supplied by Config) to check their AWS resources for compliance.

Aggregate Across Accounts and Regions
Today we are making Config Rules even more useful by adding the ability to aggregate the compliance data produced by their rules across multiple AWS accounts and/or Regions. The aggregated data can then be viewed in a single dashboard, making this a great way to improve governance and compliance. Even better, the aggregation and dashboard are available at no charge to all AWS Config users!

I’ll show you how to set this up in a moment. First, let’s define a couple of terms:

Aggregator – This is a new Config resource. It identifies the sources (accounts and regions) of the compliance data to be aggregated. Multiple aggregators can be used simultaneously, giving you the ability to fine-tune your governance and compliance model.

Aggregator account – This is an AWS account that owns one or more aggregators.

Source account – This is an AWS account that has compliance data to be aggregated.

Aggregated view – A dashboard that shows compliant and non-compliant rules for an aggregator.

Here’s how it all fits together:

Setting up Aggregation
Let’s set up aggregation for some AWS Config data! The first steps take place in the aggregator account. I open the Config Console, find the Aggregated View section and click Aggregators:

I review the list of Aggregators, and click Add aggregator to make a new one:

I grant AWS Config permission to replicate data from the source accounts and enter a name for my aggregator (MyAgg):

Next, I select the source accounts. I have three options here: I can manually add the account IDs, upload a file that contains a comma-separated list, or add all of the accounts in my AWS Organization:

I click on Add source accounts to manually add one account, enter the ID, and click Add source accounts:

Next, I choose the regions of interest, with the option to select current regions as well as future ones, then click Save to move ahead:

The next step takes place in the source account, within the Config Console. An Authorization request appears:

And I confirm it:

You can use CloudFormation StackSets to enable authorization programmatically across all source accounts. Also note that the authorization step is not needed if you choose to aggregate all accounts in your AWS Organization.
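If you prefer to script this instead of clicking through the console, the same steps map to a pair of Config API calls. Here is a minimal boto3 sketch with placeholder account IDs; the authorization call runs in each source account and the aggregator call runs in the aggregator account:

import boto3

config = boto3.client('config')

# In each source account: allow the aggregator account to collect data from this region.
config.put_aggregation_authorization(
    AuthorizedAccountId='111111111111',   # aggregator account (placeholder)
    AuthorizedAwsRegion='us-east-1'
)

# In the aggregator account: create the aggregator over the source accounts and regions.
config.put_configuration_aggregator(
    ConfigurationAggregatorName='MyAgg',
    AccountAggregationSources=[{
        'AccountIds': ['222222222222'],   # source account (placeholder)
        'AwsRegions': ['us-east-1', 'us-west-2']
    }]
)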

Compliance data from the source account begins to flow to the aggregator account and becomes visible in the Console, generally within 2-5 minutes:

As you can see, I have a multitude of filtering options! I can focus the view on a particular region or account, and I can see which rules or accounts have the most issues to address. For example, I can see all of the buckets that do not have server-side encryption enabled:

I can also look at the overall compliance situation for an account, seeing both compliant and non-compliant resources:

Things to Know
This new feature is available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), EU (Ireland), EU (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Sydney), and Asia Pacific (Singapore) regions at no charge and you can start using it today. You pay for the use of Config and Config Rules as usual.

The multi-account, multi-region data aggregation capability in AWS Config allows you to view the compliance status of your accounts from a central account. It assumes that you have already enabled Config and Config Rules across your accounts (you can use CloudFormation StackSets to distribute and deploy your Config Rules across multiple accounts).

Jeff;

Categories: Cloud

AWS Secrets Manager: Store, Distribute, and Rotate Credentials Securely

AWS Blog - Wed, 04/04/2018 - 11:05

Today we’re launching AWS Secrets Manager which makes it easy to store and retrieve your secrets via API or the AWS Command Line Interface (CLI) and rotate your credentials with built-in or custom AWS Lambda functions. Managing application secrets like database credentials, passwords, or API Keys is easy when you’re working locally with one machine and one application. As you grow and scale to many distributed microservices, it becomes a daunting task to securely store, distribute, rotate, and consume secrets. Previously, customers needed to provision and maintain additional infrastructure solely for secrets management which could incur costs and introduce unneeded complexity into systems.

AWS Secrets Manager

Imagine that I have an application that takes incoming tweets from Twitter and stores them in an Amazon Aurora database. Previously, I would have had to request a username and password from my database administrator and embed those credentials in environment variables or, in my race to production, even in the application itself. I would also need to have our social media manager create the Twitter API credentials and figure out how to store those. This is a fairly manual process, involving multiple people, that I have to restart every time I want to rotate these credentials. With Secrets Manager, my database administrator can provide the credentials in Secrets Manager once and subsequently rely on a Secrets Manager-provided Lambda function to automatically update and rotate those credentials. My social media manager can put the Twitter API keys in Secrets Manager, which I can then access with a simple API call, and I can even rotate these programmatically with a custom Lambda function calling out to the Twitter API. My secrets are encrypted with the KMS key of my choice, and each of these administrators can explicitly grant access to these secrets with granular IAM policies for individual roles or users.

Let’s take a look at how I would store a secret using the AWS Secrets Manager console. First, I’ll click Store a new secret to get to the new secrets wizard. For my RDS Aurora instance it’s straightforward to simply select the instance and provide the initial username and password to connect to the database.

Next, I’ll fill in a quick description and a name to access my secret by. You can use whatever naming scheme you want here.

Next, we’ll configure rotation to use the Secrets Manager-provided Lambda function to rotate our password every 10 days.

Finally, we’ll review all the details and check out our sample code for storing and retrieving our secret!

Once that’s done, I can review the secrets in the console.

Now, if I needed to access these secrets I’d simply call the API.

import json
import boto3

secrets = boto3.client("secretsmanager")
rds = json.loads(secrets.get_secret_value(SecretId="prod/TwitterApp/Database")['SecretString'])
print(rds)

Which would give me the following values:

{'engine': 'mysql', 'host': 'twitterapp2.abcdefg.us-east-1.rds.amazonaws.com', 'password': '-)Kw>THISISAFAKEPASSWORD:lg{&sad+Canr', 'port': 3306, 'username': 'ranman'}

More than passwords

AWS Secrets Manager works for more than just passwords. I can store OAuth credentials, binary data, and more. Let’s look at storing my Twitter OAuth application keys.
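Storing those keys programmatically is a single call. Here is a minimal boto3 sketch; the secret name and values are placeholders:

import json
import boto3

secrets = boto3.client("secretsmanager")

# Store the consumer key/secret pair as a single JSON secret.
# The secret name and values below are placeholders.
secrets.create_secret(
    Name="prod/TwitterApp/OAuth",
    Description="Twitter API credentials for the tweet ingester",
    SecretString=json.dumps({
        "consumer_key": "EXAMPLE_KEY",
        "consumer_secret": "EXAMPLE_SECRET"
    })
)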

Now, I can define the rotation for these third-party OAuth credentials with a custom AWS Lambda function that can call out to Twitter whenever we need to rotate our credentials.

Custom Rotation

One of the niftiest features of AWS Secrets Manager is custom AWS Lambda functions for credential rotation. This allows you to define completely custom workflows for credentials. Secrets Manager will call your lambda with a payload that includes a Step which specifies which step of the rotation you’re in, a SecretId which specifies which secret the rotation is for, and importantly a ClientRequestToken which is used to ensure idempotency in any changes to the underlying secret.

When you’re rotating secrets you go through a few different steps:

  1. createSecret
  2. setSecret
  3. testSecret
  4. finishSecret

The advantage of these steps is that you can add any kind of approval steps you want for each phase of the rotation. For more details on custom rotation check out the documentation.
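Here is a minimal sketch of how such a rotation function might be structured, assuming a simple password-style secret; the set and test steps are left as stubs since they depend entirely on the service whose credentials you are rotating:

import boto3

secrets = boto3.client('secretsmanager')

def lambda_handler(event, context):
    step = event['Step']
    arn = event['SecretId']
    token = event['ClientRequestToken']   # keeps retried rotations idempotent
    if step == 'createSecret':
        create_secret(arn, token)
    elif step == 'setSecret':
        set_secret(arn, token)
    elif step == 'testSecret':
        test_secret(arn, token)
    elif step == 'finishSecret':
        finish_secret(arn, token)

def create_secret(arn, token):
    # Generate a candidate credential and stage it under the AWSPENDING label.
    new_password = secrets.get_random_password()['RandomPassword']
    secrets.put_secret_value(SecretId=arn, ClientRequestToken=token,
                             SecretString=new_password, VersionStages=['AWSPENDING'])

def set_secret(arn, token):
    pass   # push the AWSPENDING value to the database or third-party API

def test_secret(arn, token):
    pass   # connect with the AWSPENDING value to confirm that it works

def finish_secret(arn, token):
    # Promote the staged version: move AWSCURRENT to it and off the old version.
    meta = secrets.describe_secret(SecretId=arn)
    current = [v for v, stages in meta['VersionIdsToStages'].items()
               if 'AWSCURRENT' in stages][0]
    secrets.update_secret_version_stage(SecretId=arn, VersionStage='AWSCURRENT',
                                        MoveToVersionId=token, RemoveFromVersionId=current)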

Available Now
AWS Secrets Manager is available today in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo). Secrets are priced at $0.40 per month per secret and $0.05 per 10,000 API calls. I’m looking forward to seeing more users adopt rotating credentials to secure their applications!

Randall

Categories: Cloud

Amazon S3 Update: New Storage Class and General Availability of S3 Select

AWS Blog - Wed, 04/04/2018 - 10:45

I’ve got two big pieces of news for anyone who stores and retrieves data in Amazon Simple Storage Service (S3):

New S3 One Zone-IA Storage Class – This new storage class is 20% less expensive than the existing Standard-IA storage class. It is designed to be used to store data that does not need the extra level of protection provided by geographic redundancy.

General Availability of S3 Select – This unique retrieval option lets you retrieve subsets of data from S3 objects using simple SQL expressions, with the possibility of a 400% performance improvement in the process.

Let’s take a look at both!

S3 One Zone-IA (Infrequent Access) Storage Class
This new storage class stores data in a single AWS Availability Zone and is designed to provide eleven 9’s (99.999999999%) of data durability, just like the other S3 storage classes. Unlike those other classes, it is not designed to be resilient to the physical loss of an AZ due to a major event such as an earthquake or a flood, and data could be lost in the unlikely event that an AZ is destroyed. S3 One Zone-IA storage gives you a lower cost option for secondary backups of on-premises data and for data that can be easily re-created. You can also use it as the target of S3 Cross-Region Replication from another AWS region.

You can specify the use of S3 One Zone-IA storage when you upload a new object to S3:

You can also make use of it as part of an S3 lifecycle rule:

You can set up a lifecycle rule that moves previous versions of an object to S3 One Zone-IA after 30 or more days:

And you can modify the storage class of an existing object:

You can also manage storage classes using the S3 API, CLI, and CloudFormation templates.
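For example, here is a minimal boto3 sketch that uploads an object directly into One Zone-IA and adds a lifecycle rule to transition older objects; the bucket, key, and prefix are placeholders:

import boto3

s3 = boto3.client('s3')

# Upload an object directly into the One Zone-IA storage class.
with open('2018-04-04.tar.gz', 'rb') as body:
    s3.put_object(
        Bucket='my-backup-bucket',
        Key='backups/2018-04-04.tar.gz',
        Body=body,
        StorageClass='ONEZONE_IA'
    )

# Or transition existing objects to One Zone-IA after 30 days with a lifecycle rule.
s3.put_bucket_lifecycle_configuration(
    Bucket='my-backup-bucket',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'move-backups-to-onezone-ia',
            'Filter': {'Prefix': 'backups/'},
            'Status': 'Enabled',
            'Transitions': [{'Days': 30, 'StorageClass': 'ONEZONE_IA'}]
        }]
    }
)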

The S3 One Zone-IA storage class can be used in all public AWS regions. As I noted earlier, pricing is 20% lower than for the S3 Standard-IA storage class (see the S3 Pricing page for more info). There’s a 30 day minimum retention period, and a 128 KB minimum object size.

General Availability of S3 Select
Randall wrote a detailed introduction to S3 Select last year and showed you how you can use it to retrieve selected data from within S3 objects. During the preview we added support for server-side encryption and the ability to run queries from the S3 Console.

I used a CSV file of airport codes to exercise the new console functionality:

This file contains listings for over 9100 airports, so it makes for useful test data but it definitely does not test the limits of S3 Select in any way. I select the file, open the More menu, and choose Select from:

The console sets the file format and compression according to the file name and the encryption status. I set the delimiter and click Show file preview to verify that my settings are correct. Then I click Next to proceed:

I type SQL expressions in the SQL editor and click Run SQL to issue the query:

Or:

I can also issue queries from the AWS SDKs. I initiate the select operation:

import boto3

s3 = boto3.client('s3', region_name='us-west-2')
r = s3.select_object_content(
        Bucket='jbarr-us-west-2',
        Key='sample-data/airportCodes.csv',
        ExpressionType='SQL',
        Expression="select * from s3object s where s.\"Country (Name)\" like '%United States%'",
        InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}},
        OutputSerialization = {'CSV': {}},
)

And then I process the stream of results:

for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
        print("Stats details bytesScanned: ")
        print(statsDetails['BytesScanned'])
        print("Stats details bytesProcessed: ")
        print(statsDetails['BytesProcessed'])

S3 Select is available in all public regions and you can start using it today. Pricing is based on the amount of data scanned and the amount of data returned.

Jeff;

Categories: Cloud

Amazon Transcribe Now Generally Available

AWS Blog - Wed, 04/04/2018 - 10:02


At AWS re:Invent 2017 we launched Amazon Transcribe in private preview. Today we’re excited to make Amazon Transcribe generally available for all developers. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capabilities to their applications. We’ve iterated on customer feedback during the preview to make a number of enhancements to Amazon Transcribe.

New Amazon Transcribe Features in GA

To start off, we’ve made the SampleRate parameter optional, which means you only need to know the file type of your media and the input language. We’ve added two new features – the ability to differentiate multiple speakers in the audio to provide more intelligible transcripts (“who spoke when”), and a custom vocabulary to improve the accuracy of speech recognition for product names, industry-specific terminology, or names of individuals. To refresh our memories on how Amazon Transcribe works, let’s look at a quick example. I’ll convert this audio in my S3 bucket.

import boto3

transcribe = boto3.client("transcribe")
transcribe.start_transcription_job(
    TranscriptionJobName="TranscribeDemo",
    LanguageCode="en-US",
    MediaFormat="mp3",
    Media={"MediaFileUri": "https://s3.amazonaws.com/randhunt-transcribe-demo-us-east-1/out.mp3"}
)
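The job runs asynchronously, so a typical pattern is to poll for completion and then fetch the transcript from the returned URI. Here is a minimal sketch of that, reusing the job name from the call above; the 10-second polling interval is an arbitrary choice:

import time
import boto3

transcribe = boto3.client("transcribe")

# Poll until the job above finishes, then print the URI of the transcript file.
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName="TranscribeDemo")
    status = job["TranscriptionJob"]["TranscriptionJobStatus"]
    if status in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

if status == "COMPLETED":
    print(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])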

The transcript JSON looks similar to this (I’ve stripped out most of the response), with individual speakers identified:

{ "jobName": "reinvent", "accountId": "1234", "results": { "transcripts": [ { "transcript": "Hi, everybody, i'm randall ..." } ], "speaker_labels": { "speakers": 2, "segments": [ { "start_time": "0.000000", "speaker_label": "spk_0", "end_time": "0.010", "items": [] }, { "start_time": "0.010000", "speaker_label": "spk_1", "end_time": "4.990", "items": [ { "start_time": "1.000", "speaker_label": "spk_1", "end_time": "1.190" }, { "start_time": "1.190", "speaker_label": "spk_1", "end_time": "1.700" } ] } ] }, "items": [ { "start_time": "1.000", "end_time": "1.190", "alternatives": [ { "confidence": "0.9971", "content": "Hi" } ], "type": "pronunciation" }, { "alternatives": [ { "content": "," } ], "type": "punctuation" }, { "start_time": "1.190", "end_time": "1.700", "alternatives": [ { "confidence": "1.0000", "content": "everybody" } ], "type": "pronunciation" } ] }, "status": "COMPLETED" } Custom Vocabulary

Now if I needed to have a more complex technical discussion with a colleague I could create a custom vocabulary. A custom vocabulary is specified as an array of strings passed to the CreateVocabulary API and you can include your custom vocabulary in a transcription job by passing in the name as part of the Settings in a StartTranscriptionJob API call. An individual vocabulary can be as large as 50KB and each phrase must be less than 256 characters (use hyphens for spaces). More information on custom vocabularies can be found in the documentation. If I wanted to transcribe the recordings of my highschool AP Biology class I could create a custom vocabulary in Python like this:

import boto3

transcribe = boto3.client("transcribe")
transcribe.create_vocabulary(
    LanguageCode="en-US",
    VocabularyName="APBiology",
    Phrases=[
        "endoplasmic-reticulum",
        "organelle",
        "cisternae",
        "eukaryotic",
        "ribosomes",
        "hepatocytes",
        "cell-membrane"
    ]
)
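To use the vocabulary, I pass its name in the Settings of the transcription job. A minimal sketch follows; the job name and media URI are placeholders:

import boto3

transcribe = boto3.client("transcribe")

# Reference the custom vocabulary by name in the job settings.
# The media URI below is a placeholder.
transcribe.start_transcription_job(
    TranscriptionJobName="APBiologyLecture1",
    LanguageCode="en-US",
    MediaFormat="mp3",
    Media={"MediaFileUri": "https://s3.amazonaws.com/my-bucket/lecture1.mp3"},
    Settings={"VocabularyName": "APBiology"}
)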

I can refer to this vocabulary in my StartTranscriptionJob calls by the name APBiology, and I can update the vocabulary programmatically based on any errors I find in the transcriptions.

Available Now

Amazon Transcribe is available now in US East (N. Virginia), US West (Oregon), US East (Ohio), and EU (Ireland). Transcribe’s free tier gives you 60 minutes of transcription per month for the first 12 months, with a pay-as-you-go price of $0.0004 per second of transcribed audio after that and a minimum charge of 15 seconds.

When combined with other tools and services, I think Transcribe opens up entirely new opportunities for application development. I’m excited to see what developers build with this new service.

Randall

Categories: Cloud

Amazon SageMaker Now Supports Additional Instance Types, Local Mode, Open Sourced Containers, MXNet and Tensorflow Updates

AWS Blog - Wed, 04/04/2018 - 10:02

Amazon SageMaker continues to iterate quickly and release new features on behalf of customers. Starting today, SageMaker adds support for many new instance types, local testing with the SDK, and Apache MXNet 1.1.0 and Tensorflow 1.6.0. Let’s take a quick look at each of these updates.

New Instance Types

Amazon SageMaker customers now have additional options for right-sizing their workloads for notebooks, training, and hosting. Notebook instances now support almost all T2, M4, P2, and P3 instance types with the exception of t2.micro, t2.small, and m4.large instances. Model training now supports nearly all M4, M5, C4, C5, P2, and P3 instances with the exception of m4.large, c4.large, and c5.large instances. Finally, model hosting now supports nearly all T2, M4, M5, C4, C5, P2, and P3 instances with the exception of m4.large instances. Many customers can take advantage of the newest P3, C5, and M5 instances to get the best price/performance for their workloads. Customers can also take advantage of the burstable compute model on T2 instances for endpoints or notebooks that are used less frequently.

Open Sourced Containers, Local Mode, and TensorFlow 1.6.0 and MXNet 1.1.0

Today Amazon SageMaker has open sourced the MXNet and Tensorflow deep learning containers that power the MXNet and Tensorflow estimators in the SageMaker SDK. The ability to write Python scripts that conform to a simple interface is still one of my favorite SageMaker features, and now those containers can be customized further to include any additional libraries. You can download these containers locally to iterate and experiment, which can accelerate your debugging cycle. When you’re ready to go from local testing to production training and hosting, you just change one line of code.
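Here is a minimal sketch of what local mode looks like with the SageMaker Python SDK, assuming a training script named train.py, a placeholder IAM role, and a local data path; switching to managed training is just a matter of changing the instance type:

from sagemaker.mxnet import MXNet

# Train on the notebook instance (or laptop) itself by using the special
# 'local' instance type; the entry point, role, and data path are placeholders.
estimator = MXNet(
    entry_point='train.py',
    role='SageMakerRole',
    train_instance_count=1,
    train_instance_type='local',
    framework_version='1.1.0'
)
estimator.fit({'training': 'file:///tmp/my-training-data'})

# When you're ready for managed training, the one-line change is:
#   train_instance_type='ml.p3.2xlarge'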

These containers launch with support for Tensorflow 1.6.0 and MXNet 1.1.0 as well. Tensorflow has a number of new 1.6.0 features including support for CUDA 9.0, cuDNN 7, and AVX instructions which allows for significant speedups in many training applications. MXNet 1.1.0 adds a number of new features including a Text API mxnet.text with support for text processing, indexing, glossaries, and more. Two of the really cool pre-trained embeddings included are GloVe and fastText.

Available Now
All of the features mentioned above are available today. As always please let us know on Twitter or in the comments below if you have any questions or if you’re building something interesting. Now, if you’ll excuse me I’m going to go experiment with some of those new MXNet APIs!

Randall

Categories: Cloud

Implementation Guide on Headless and Decoupled CMS

Drupal News - Mon, 04/02/2018 - 14:03

The following blog was written by Drupal Association Signature Hosting Supporter, Acquia

The rapid evolution of diverse end-user clients and applications has given rise to a dizzying array of digital channels to support.

Websites in the past were built from monolithic architectures utilizing web content management solutions that deliver content through a templating solution tightly “coupled” with the content management system on the back-end.

Agile organizations crave flexibility, and strive to manage structured content across different presentation layers consistently in a way that’s scalable.

Accomplishing this efficiently requires that teams have flexibility in the front-end frameworks that dominate the modern digital landscape. That’s why decoupled and headless CMS is taking off. That’s why you’re here. But now you need the right technology to support the next phase of the web and beyond.

Download this eBook on headless and decoupled CMS
Categories: Drupal

The 6th Annual China PHP Conference

PHP News - Fri, 03/30/2018 - 03:00
Categories: PHP

PHP 7.1.16 Released

PHP News - Thu, 03/29/2018 - 22:35
Categories: PHP

PHP 5.6.35 Released

PHP News - Thu, 03/29/2018 - 16:26
Categories: PHP

PHP 7.0.29 Released

PHP News - Thu, 03/29/2018 - 05:00
Categories: PHP

PHP 7.2.4 Released

PHP News - Thu, 03/29/2018 - 03:58
Categories: PHP

Drupal core - Highly critical - Remote Code Execution - SA-CORE-2018-002

Drupal News - Wed, 03/28/2018 - 11:14
Project: Drupal core
Date: 2018-March-28
Security risk: Highly critical 21/25 AC:None/A:None/CI:All/II:All/E:Theoretical/TD:Default
Vulnerability: Remote Code Execution

Description: 

CVE: CVE-2018-7600

A remote code execution vulnerability exists within multiple subsystems of Drupal 7.x and 8.x. This potentially allows attackers to exploit multiple attack vectors on a Drupal site, which could result in the site being completely compromised.

The security team has written an FAQ about this issue.

Solution: 

Upgrade to the most recent version of Drupal 7 or 8 core.

  • If you are running 7.x, upgrade to Drupal 7.58. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)
  • If you are running 8.5.x, upgrade to Drupal 8.5.1. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)

Drupal 8.3.x and 8.4.x are no longer supported and we don't normally provide security releases for unsupported minor releases. However, given the potential severity of this issue, we are providing 8.3.x and 8.4.x releases that includes the fix for sites which have not yet had a chance to update to 8.5.0.

Your site's update report page will recommend the 8.5.x release even if you are on 8.3.x or 8.4.x. Please take the time to update to a supported version after installing this security update.

This issue also affects Drupal 8.2.x and earlier, which are no longer supported. If you are running any of these versions of Drupal 8, update to a more recent release and then follow the instructions above.

This issue also affects Drupal 6. Drupal 6 is End of Life. For more information on Drupal 6 support please contact a D6LTS vendor.

Reported By: 
Fixed By: 

Contact and more information

The Drupal security team can be reached by email at security at drupal.org or via the contact form.

Learn more about the Drupal Security team and their policies, writing secure code for Drupal, and securing your site.

Categories: Drupal

Amazon ECS Service Discovery

AWS Blog - Wed, 03/28/2018 - 11:10

Amazon ECS now includes integrated service discovery. This makes it possible for an ECS service to automatically register itself with a predictable and friendly DNS name in Amazon Route 53. As your services scale up or down in response to load or container health, the Route 53 hosted zone is kept up to date, allowing other services to lookup where they need to make connections based on the state of each service. You can see a demo of service discovery in an imaginary social networking app over at: https://servicediscovery.ranman.com/.

Service Discovery


Part of the transition to microservices and modern architectures involves having dynamic, autoscaling, and robust services that can respond quickly to failures and changing loads. Your services probably have complex dependency graphs of services they rely on and services they provide. A modern architectural best practice is to loosely couple these services by allowing them to specify their own dependencies, but this can be complicated in dynamic environments as your individual services are forced to find their own connection points.

Traditional approaches to service discovery like consul, etcd, or zookeeper all solve this problem well, but they require provisioning and maintaining additional infrastructure or installation of agents in your containers or on your instances. Previously, to ensure that services were able to discover and connect with each other, you had to configure and run your own service discovery system or connect every service to a load balancer. Now, you can enable service discovery for your containerized services in the ECS console, AWS CLI, or using the ECS API.

Introducing Amazon Route 53 Service Registry and Auto Naming APIs

Amazon ECS Service Discovery works by communicating with the Amazon Route 53 Service Registry and Auto Naming APIs. Since we haven’t talked about it before on this blog, I want to briefly outline how these Route 53 APIs work. First, some vocabulary:

  • Namespaces – A namespace specifies a domain name you want to route traffic to (e.g. internal, local, corp). You can think of it as a logical boundary between which services should be able to discover each other. You can create a namespace with a call to the aws servicediscovery create-private-dns-namespace command or in the ECS console. Namespaces are roughly equivalent to hosted zones in Route 53. A namespace contains services, our next vocabulary word.
  • Service – A service is a specific application or set of applications in your namespace like “auth”, “timeline”, or “worker”. A service contains service instances.
  • Service Instance – A service instance contains information about how Route 53 should respond to DNS queries for a resource.

Route 53 provides APIs to create: namespaces, A records per task IP, and SRV records per task IP + port.
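For example, creating a namespace is a single call. Here is a minimal boto3 sketch; the namespace name and VPC ID are placeholders:

import boto3

sd = boto3.client('servicediscovery')

# Create a private DNS namespace for the services to register into.
# The namespace name and VPC ID are placeholders.
sd.create_private_dns_namespace(
    Name='corp',
    Vpc='vpc-0123456789abcdef0'
)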

When we ask Route 53 for something like worker.corp, we should get back a set of possible IPs that could fulfill that request. If the application we’re connecting to exposes dynamic ports, the calling application can easily query the SRV record to get more information.

ECS service discovery is built on top of the Route 53 APIs and manages all of the underlying API calls for you. Now that we understand how the service registry works, let’s take a look at the ECS side to see service discovery in action.

Amazon ECS Service Discovery

Let’s launch an application with service discovery! First, I’ll create two task definitions: “flask-backend” and “flask-worker”. Both are simple AWS Fargate tasks with a single container serving HTTP requests. I’ll have flask-backend ask worker.corp to do some work and I’ll return the response as well as the address Route 53 returned for worker. Something like the code below:

@app.route("/") namespace = os.getenv("namespace") worker_host = "worker" + namespace def backend(): r = requests.get("http://"+worker_host) worker = socket.gethostbyname(worker_host) return "Worker Message: {]\nFrom: {}".format(r.content, worker)

 

Now, with my containers and task definitions in place, I’ll create a service in the console.

As I move to step two in the service wizard I’ll fill out the service discovery section and have ECS create a new namespace for me.

I’ll also tell ECS to monitor the health of the tasks in my service and add or remove them from Route 53 as needed. Then I’ll set a TTL of 10 seconds on the A records we’ll use.

I’ll repeat those same steps for my “worker” service and after a minute or so most of my tasks should be up and running.

Over in the Route 53 console I can see all the records for my tasks!

We can use the Route 53 service discovery APIs to list all of our available services and tasks and programmatically reach out to each one. We could easily extend to any number of services past just backend and worker. I’ve created a simple demo of an imaginary social network with services like “auth”, “feed”, “timeline”, “worker”, “user” and more here: https://servicediscovery.ranman.com/. You can see the code used to run that page on github.

Available Now
Amazon ECS service discovery is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). AWS Fargate is currently only available in US East (N. Virginia). When you use ECS service discovery, you pay for the Route 53 resources that you consume, including each namespace that you create, and for the lookup queries your services make. Container level health checks are provided at no cost. For more information on pricing check out the documentation.

Please let us know what you’ll be building or refactoring with service discovery either in the comments or on Twitter!

Randall

 

P.S. Every blog post I write is made with a tremendous amount of help from numerous AWS colleagues. To everyone that helped build service discovery across all of our teams – thank you :)!

Categories: Cloud

Thunder, the Drupal 8 Distribution for Professional Publishing

Drupal News - Tue, 03/27/2018 - 16:41
https://thunder.org/

Thunder is the Drupal 8 distribution for professional publishing. Thunder was designed by Hubert Burda Media and released as open-source software under the GNU General Public License in 2016. As members of the Thunder community, publishers, partners, and developers build custom extensions and share them with the community to further enhance Thunder.

Thunder consists of the current Drupal 8 functionality, lots of handpicked publisher-centric modules with custom enhancements (our own Thunder Admin Theme, the Paragraphs module, the Media Entity module, the Entity Browser module, and lots more), and an environment which makes it easy to install, deploy and add new functionality (e.g. the Thunder Updater).

To learn more about Thunder projects, read these case studies: German magazine Mein Schöner Garten (Gardening Magazine for Hubert Burda Media), US magazine American Heritage (American Heritage Magazine Migration – Drupal 8), and Serbian television and radio station PannonRTV (News portal for media house – PannonRTV).

About the idea:

We at the Thunder Core Team believe that publishers do not compete with each other through technology, but rather through content and brands. That is why the German publisher Hubert Burda Media established the Thunder community which aims to join forces among media companies by sharing code and innovation power. The goal is to innovate faster and spend less money overall by working together.

The Thunder community’s core product is the open-source content management system Thunder. Community members develop useful modules, use them for their own purposes and share them with the community by publishing them under the GNU General Public License. Neither Hubert Burda Media nor the other publishers in the community charge anyone for their contributions.

Any company publishing content professionally is welcome as a member of the Thunder community - both as user and as contributor. Anyone can join by contributing to the distribution. The usefulness and richness of Thunder’s functionality directly benefit from the number of contributors.

Why Drupal was chosen: 

For Burda, Drupal is the content management platform of choice. It is a free and open-source content-management framework written in PHP and distributed under the GNU General Public License.

The standard Drupal core already provides the essential features, e.g. user management, menu management, RSS feeds, taxonomy, page layout customization, and system administration. It is easily adaptable and extensible with thousands of modules provided by a global community of users and developers. In addition, developers at Hubert Burda Media have had previous good experiences with Drupal. Drupal is therefore a tried and tested basis and has become even better with Drupal 8.

Describe the project (goals, requirements and outcome): 

Thunder started as a way to share innovation and synergies among the many different brands and products within the Burda Corporation to save costs and speed up the time to market. It did not take long until we realized that the model that worked within the very diverse Burda universe would be useful for almost all digital publishers. That was when we decided to open source the distribution.

Due to its open source basis on Drupal 8, all features and functionality within Thunder are available to anyone wishing to benefit from Burda’s industry experience. Individual brands can add modules to tailor the system to their specific needs. Many of those “specific” customizations will prove to be valuable to more than just the organizations they originated from. We therefore designed Thunder in a way that we can easily incorporate those add-ons into the main distribution and share the features among all brands.

Goals:

We aim at becoming the best open-source content management system for professional publishing. In this, we focus on the creation of content. We want to help editors to create articles, to add media, to build landing pages, in short, to share their stories with the world.
We want Thunder to be a CMS jointly developed by its users and are therefore working towards building a community of publishers, IT agencies, and anyone else who shares our ideas and contributes to Thunder.

Our aim in doing so is to stay very close to the Drupal community and the Drupal core instead of creating a Thunder fork. Whenever we want to implement a new functionality or solve a problem, we try to do this in Drupal core or in the modules Thunder uses instead of fixing things in the distribution.

Time spent:

It’s difficult to measure the time spent on the development of Thunder, as this is an ongoing process. Currently, there are four developers employed by Hubert Burda Media working on the distribution full-time, plus several external developers. They focus on the advancement of Thunder as well as Drupal core and the contrib modules used in the distribution. A community manager is working on coordinating and growing the Thunder community of publishers, developers, and other partners.

Timeline and Milestones:
  • 30th August 2015: Repository and first commits for Thunder
  • September 2015: playboy.de – the first website running on Thunder
  • November 2015: instyle.de – the second website running on Thunder as well as proof of concept of the sharing model
  • 17th March 2016: Official press release about Thunder
  • October 2016: produceretailer.com is the first professional non-Burda website running on Thunder
  • 30th January 2017: Release of Thunder 1.0
  • March 2017: One year after the official launch of the Thunder initiative, 15 websites (that we know of) are running on Thunder.
  • 1st June 2017: Release of Thunder 2.0
  • 20th July 2017: Release of Thunder Admin Theme
  • 20th November 2017: First community event, the Thunder Day in Hamburg
Results:

We released Thunder 1.0 in January 2017. One year later, at least 60 professional websites that we know of now run on Thunder. In the meantime, we have also released Thunder 2.0 and the Thunder Admin Theme.

Publishing houses grabbed the idea of working together. The Austrian publisher kurier.at, for example, contributed to the liveblog module used in Thunder and developed a new functionality to split text paragraphs.

In community matters, we talked to more than 300 companies worldwide. We established the “Certified Thunder Integrator” program to help publishers to find IT agencies as well as IT agencies to find customers. As of now, there are more than 20 companies certified or in the certification process.

We aim at bringing people together to share experiences. For this purpose, we introduced a Slack team for the Thunder community as well as several social media accounts. Furthermore, we organized the first community event – the Thunder Day – with around 120 participants in November 2017.

Challenges and how we resolved them:

Updating:

Distributions such as Thunder face the problem of losing control after the installation. How should a distribution actually deliver features and updates? We thought a lot about this problem and introduced the Thunder Updater, the “Thunder way to keep your site up to date”. Thunder checks if installed configurations have been changed – if not, they can be updated. Otherwise, you will get a message telling you there’s an update pending and what to do if you wish to have it. This functionality is currently an integral part of the distribution but we plan to detach it and publish it as a module on drupal.org soon so that everybody can use it.

Testing:

Writing an Admin Theme is very difficult because Drupal offers so many possibilities to adapt things: If you change something it can have unexpected effects in unexpected places. To avoid surprises, we developed Sharpeye, a visual regression tool. It takes screenshots and compares them in automated tests. This gives us a good overview. We open sourced the tool and you can download it here: github.com/BurdaMagazinOrg/sharpeye

Technical details, tips, and tricks:

Tooling:

We invested a lot of time into automated testing but it was well worth the effort, not only for Thunder but also for Drupal core and the contrib modules we use since we discovered a lot of bugs there too.

Development process:

We don’t use a closed issue tracker; instead, we publish our tickets on drupal.org, creating transparency. We use GitHub rather than drupal.org for development because the developer experience is much better.

Organizations involved: 

Thunder

Modules/Themes/Distributions

Key modules/theme/distribution used: 

Why these modules/theme/distribution were chosen:

Requirements / Key modules

Storytelling

In professional publishing, it’s all about the story. It has to be easy to create a story, to extend it, to change its narrative strand, and to enrich it with multimedia content. We use the Paragraphs module for this. Instead of putting all their content in one WYSIWYG body field, including images and videos, end users can choose on the fly between pre-defined Paragraph Types that are independent of one another. A Paragraph Type can be anything from a simple text block or image to a complex, configurable slideshow. This allows editors to structure an article into sub-elements which can easily be created, edited, and reorganized.

Media Handling

Editors want to enrich their articles with pictures, videos, content from social media, and whatever else you might think of. Paragraphs are one part of this; the other is the combination of the Media Entity module and the Entity Browser module. With those modules, editors can easily upload new content as well as find and reuse existing entities.

SEO

Search engine optimization plays a major role in every editor’s life. Thunder therefore has a plethora of settings to adjust, from various meta tags for Facebook, Twitter, and Open Graph to a simple XML sitemap.

Scheduled Publishing

An editor’s daily life involves a lot of planning. With Thunder, you can schedule articles to ensure they are published at a given date and time. Even more importantly, you can also schedule the time at which an article or a picture should no longer be shown on the website, for example when the contract period for a photograph has ended or an event announcement is no longer useful.

Improved Authoring Experience

Our primary focus is making the editors’ work with Thunder as easy as possible. In order to achieve this, we created the Thunder Admin Theme based on findings of user tests and a survey conducted with editors working with Thunder.

Detailed Module List

Find a detailed list of the modules we use in Thunder here: burdamagazinorg.github.io/thunder-documentation/modules

Community contributions: 

Since we get a lot from the Drupal community, we do our best to contribute back, for example by fixing the bugs we find through automated tests and by supporting Drupal events and code sprints with developer time, talks, and sponsorship. Christian Fritsch, a member of the Thunder Core Team, contributed a lot of his time to the media initiative. Ingo Rübe, the initiator of Thunder, is a member of the Drupal Association’s Board of Directors.

Project team: 
  • Daniel Bosen - Lead Developer
  • Christian Fritsch - Senior Developer
  • Mladen Todorovic - Senior Developer
  • Volker Killesreiter - Senior Developer
  • Julia Pradel - Community Manager
  • Ingo Rübe - Initiator of Thunder
  • Collin Müller - Head of Strategic Development
Thunder is a proud sponsor of the Media and Publishing Summit ahead of DrupalCon in Nashville. Meet us on 9th April to learn more about Thunder and how it is used in professional publishing.
Categories: Drupal

New – Amazon DynamoDB Continuous Backups and Point-In-Time Recovery (PITR)

AWS Blog - Mon, 03/26/2018 - 17:13

The Amazon DynamoDB team is back with another useful feature, hot on the heels of encryption at rest. At AWS re:Invent 2017 we launched global tables and on-demand backup and restore of your DynamoDB tables, and today we’re launching continuous backups with point-in-time recovery (PITR).

You can enable continuous backups with a single click in the AWS Management Console, a simple API call, or with the AWS Command Line Interface (CLI). DynamoDB can back up your data with per-second granularity and restore to any single second from the time PITR was enabled up to the prior 35 days. We built this feature to protect against accidental writes or deletes. If a developer runs a script against production instead of staging or if someone fat-fingers a DeleteItem call, PITR has you covered. We also built it for the scenarios you can’t normally predict. You can still keep your on-demand backups for as long as needed for archival purposes but PITR works as additional insurance against accidental loss of data. Let’s see how this works.

Continuous Backup

To enable this feature in the console, I navigate to my table and select the Backups tab. From there, I simply click Enable to turn on the feature. I could also turn on continuous backups via the UpdateContinuousBackups API call.
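
Here’s a quick sketch of that API call using Boto 3; the table name is just a placeholder.

# Enable point-in-time recovery for a table via the UpdateContinuousBackups API.
import boto3

dynamodb = boto3.client("dynamodb")
dynamodb.update_continuous_backups(
    TableName="VerySuperImportantTable",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)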

After continuous backups are enabled, we should be able to see an Earliest restore date and a Latest restore date.
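
Those dates can also be read programmatically via the DescribeContinuousBackups API; here’s a sketch with Boto 3, again using a placeholder table name.

# Read the current restore window for a table.
import boto3

dynamodb = boto3.client("dynamodb")
backups = dynamodb.describe_continuous_backups(
    TableName="VerySuperImportantTable"
)["ContinuousBackupsDescription"]

pitr = backups["PointInTimeRecoveryDescription"]
print("Earliest restore date:", pitr["EarliestRestorableDateTime"])
print("Latest restore date:", pitr["LatestRestorableDateTime"])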

Let’s imagine a scenario where I have a lot of old user profiles that I want to delete.

I really only want to send service updates to our active users based on their last_update date. I decided to write a quick Python script to delete all the users that haven’t used my service in a while.

import boto3

table = boto3.resource("dynamodb").Table("VerySuperImportantTable")

items = table.scan(
    FilterExpression="last_update >= :date",
    ExpressionAttributeValues={":date": "2014-01-01T00:00:00"},
    ProjectionExpression="ImportantId"
)['Items']

print("Deleting {} Items! Dangerous.".format(len(items)))

with table.batch_writer() as batch:
    for item in items:
        batch.delete_item(Key=item)

Great! This should delete all those pesky non-users of my service that haven’t logged in since 2013. So, I run it and – CTRL+C CTRL+C CTRL+C CTRL+C (interrupting the currently executing command).

Yikes! Do you see where I went wrong? I’ve just deleted my most important users! Oh, no! Where I had a greater-than sign, I meant to put a less-than! Quick, before Jeff Barr can see, I’m going to restore the table. (I probably could have prevented that typo with Boto 3’s handy DynamoDB conditions: Attr("last_update").lt("2014-01-01T00:00:00"))
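
For reference, here’s a sketch of the scan filter written the way I intended, using that condition helper; it’s the same hypothetical table and cutoff as above.

# Select only users who have NOT been active since the cutoff date.
import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource("dynamodb").Table("VerySuperImportantTable")
stale_items = table.scan(
    FilterExpression=Attr("last_update").lt("2014-01-01T00:00:00"),
    ProjectionExpression="ImportantId",
)["Items"]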

Restoring

Luckily for me, restoring a table is easy. In the console I’ll navigate to the Backups tab for my table and click Restore to point-in-time.

I’ll specify the time (a few seconds before I started my deleting spree) and a name for the table I’m restoring to.

For a relatively small and evenly distributed table like mine, the restore is quite fast.
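
The same restore can be started with the RestoreTableToPointInTime API; here’s a Boto 3 sketch with an example timestamp and target table name.

# Restore the table to a moment just before the accidental deletes.
from datetime import datetime, timezone
import boto3

dynamodb = boto3.client("dynamodb")
dynamodb.restore_table_to_point_in_time(
    SourceTableName="VerySuperImportantTable",
    TargetTableName="VerySuperImportantTable-restored",
    RestoreDateTime=datetime(2018, 3, 26, 16, 50, 0, tzinfo=timezone.utc),
)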

The time it takes to restore a table varies based on multiple factors, and restore times are not necessarily correlated with the size of the table. If your dataset is evenly distributed across your primary keys, you’ll be able to take advantage of parallelization, which will speed up your restores.

Learn More & Try It Yourself
There’s plenty more to learn about this new feature in the DynamoDB documentation.

Pricing for continuous backups varies by region and is based on the current size of the table and all indexes.

A few things to note:

  • PITR works with encrypted tables.
  • If you disable PITR and later reenable it, you reset the start time from which you can recover.
  • Just like on-demand backups, there are no performance or availability impacts to enabling this feature.
  • Stream settings, Time To Live settings, PITR settings, tags, Amazon CloudWatch alarms, and auto scaling policies are not copied to the restored table.
  • Jeff, it turns out, knew I restored the table all along because every PITR API call is recorded in AWS CloudTrail.

Let us know how you’re going to use continuous backups and PITR on Twitter and in the comments.
Randall

Categories: Cloud

March Machine Learning Madness!

AWS Blog - Mon, 03/26/2018 - 10:01

Mid-March in the USA means millions of people watching, and betting on, college basketball (I live here, but I didn’t make the rules). As the NCAA college championship continues, I wanted to briefly highlight the work of Wesley Pasfield, one of our Professional Services Machine Learning Specialists. Wesley was able to take data from kenpom.com and College Basketball Reference to build a model predicting the outcome of March Madness using the Amazon SageMaker built-in XGBoost algorithm.

Wesley walks us through grabbing the data, performing an exploratory data analysis (EDA in the data science lingo), reshaping the data for the XGBoost algorithm, using the SageMaker SDK to create training jobs for two different models, and finally creating a SageMaker inference endpoint for serving predictions at https://cbbpredictions.com/. Check out part one of the post and part two.
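
For readers who want to try something similar, here’s a rough sketch of what a training job with the SageMaker built-in XGBoost algorithm could look like using the SageMaker Python SDK of that era; the role ARN, bucket, and data paths below are placeholders, not the setup from Wesley’s notebook.

# Train a model with the built-in XGBoost algorithm on SageMaker.
import boto3
import sagemaker
from sagemaker.amazon.amazon_estimator import get_image_uri
from sagemaker.session import s3_input

session = sagemaker.Session()
region = boto3.Session().region_name

# Built-in algorithm container for XGBoost in the current region.
container = get_image_uri(region, "xgboost")

estimator = sagemaker.estimator.Estimator(
    container,
    "arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role
    train_instance_count=1,
    train_instance_type="ml.m4.xlarge",
    output_path="s3://my-bucket/march-madness/output",  # placeholder bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

# CSV training data with the label in the first column (placeholder path).
train_input = s3_input("s3://my-bucket/march-madness/train.csv", content_type="text/csv")
estimator.fit({"train": train_input})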

Pretty cool, right? Why not open the notebook and give the XGBoost algorithm a try? Just know that there are a few caveats to the predictions, so don’t go making your champion prediction just yet!

Randall

Categories: Cloud
