background image

Amazon Redshift

Management Guide

API Version 2012-12-01

Summary of Contents for Redshift

Page 1: ...Amazon Redshift Management Guide API Version 2012 12 01...

Page 2: ...Amazon Web Services Amazon Redshift Management Guide...

Page 3: ...azon Route 53 Amazon S3 Amazon VPC In addition Amazon com graphics logos page headers button icons scripts and service names are trademarks or trade dress of Amazon in the U S and or other countries A...

Page 4: ...sing the Console 63 Viewing Events Using AWS SDK for Java 65 View Events Using Amazon Redshift CLI and API 66 Monitoring Cluster Performance 67 Summary of Performance Data 68 Working with Performance...

Page 5: ...dless of the size of the data set Amazon Redshift offers high query performance Amazon Redshift manages all the work of setting up operating and scaling a data warehouse provisioning capacity monitori...

Page 6: ...see Monitoring Amazon Redshift Cluster Performance p 67 Control access to your Amazon Redshift resources The AWS account that creates the cluster has full access to the cluster Within your AWS accoun...

Page 7: ...tabase developer the Amazon Redshift Database Developer Guide explains how to design build query and maintain the databases that make up your data warehouse API Version 2012 12 01 3 Amazon Redshift Ma...

Page 8: ...r way involves minimal downtime Resizing replaces old cluster at the end of the resize operation When you submit a resize request the source cluster remains in read only mode until the resize operatio...

Page 9: ...32 nodes 8XL clusters can contain 2 to 100 nodes For pricing information and detailed information about individual node types go the Amazon Redshift product detail page The number of nodes you choose...

Page 10: ...switch you will notice the connection loss You must restart the query on the new cluster In a modify operation you can resize your cluster as necessary For example you can add more nodes change the no...

Page 11: ...by step instructions go to Supported Platforms in the Amazon Elastic Compute Cloud User Guide 3 If your account supports both EC2 platforms choose the platform on which you want to deploy the Amazon...

Page 12: ...If no clusters are configured for your account click Launch Cluster b If one or more clusters are defined for your account in the navigation pane click Clusters Under Launch an Amazon Redshift Cluster...

Page 13: ...database instance d In the Master User Name box specify an account name for the master user of the database Master user names must meet the following conditions Must contain from 1 to 128 alphanumeri...

Page 14: ...the Cluster Type box click a cluster type c In the Number of Compute Nodes box specify the number of nodes that the cluster will contain If you select Cluster Type as Single Node the number of comput...

Page 15: ...ity Zone box click the availability zone where you want to provision your cluster c In the Cluster Security Groups box click a security group to associate with the cluster Provision the cluster outsid...

Page 16: ...he cluster If everything is satisfactory click Launch Cluster to start the creation process Otherwise click Back to make any necessary modification and then click Continue to return to the Review page...

Page 17: ...retention period All of these changes take effect only after the cluster is restarted If you decrease the automated snapshot retention period existing automated snapshots that fall outside of the new...

Page 18: ...associate multiple security groups with a cluster you can select multiple items from the Cluster Security Groups list In the following example a cluster provisioned in a VPC is modified so that two V...

Page 19: ...mation about cluster snapshots see Amazon Redshift Snapshots p 52 To delete a cluster 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshi...

Page 20: ...ing a Cluster When you reboot a cluster the cluster status is set to rebooting and a cluster event is created when the reboot is completed Any pending cluster modifications are applied at this reboot...

Page 21: ...ou should wait until the resize is complete to resume working with the cluster To resize a cluster 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amaz...

Page 22: ...dshift console at https console aws amazon com redshift 2 In the navigation pane click Clusters 3 Under Performance Status click the magnifying glass icon beside the cluster identifier The cluster sum...

Page 23: ...us 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the navigation pane click Clusters 3 Under Performance Status click the mag...

Page 24: ...snapshot of a cluster 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the navigation pane click Clusters 3 Under Performance S...

Page 25: ...After you initiate the request for the cluster to be created you must wait until the cluster is in the available state before you can modify it This example uses a loop to periodically check the statu...

Page 26: ...eateResponse client createCluster request System out println Created cluster createResponse getClusterIdenti fier private static void describeClusters DescribeClustersRequest request new DescribeClust...

Page 27: ...status equalsIgnoreCase available clusterReady true else System out print Thread sleep sleepTime 1000 Manage Clusters Using Amazon Redshift CLI and API You can use the following Amazon Redshift CLI o...

Page 28: ...nformation see Creating a Cluster in a VPC p 24 Restoring a Snapshot of a Cluster in VPC A snapshot of a cluster in VPC can only be restored in a VPC not outside the VPC You can restore it in the same...

Page 29: ...C on the public Internet to connect to the cluster you must modify the VPC security group Specifically you should allow the port that you used when you launched the cluster For examples of security gr...

Page 30: ...bnet Group p 28 The section explains how to manage your cluster subnet groups using the Amazon Redshift console You can create a cluster subnet group manage an existing one or delete one All of these...

Page 31: ...ne subnet group 5 Click Yes Create The new group will be displayed in the list of cluster subnet groups Modifying a Cluster Subnet Group To modify a cluster subnet group 1 Sign into the AWS Management...

Page 32: ...from the group Deleting a Cluster Subnet Group You cannot delete a cluster subnet group that is used by a cluster To delete a cluster subnet group 1 Sign into the AWS Management Console and open the A...

Page 33: ...rrayList import java util List import com amazonaws auth AWSCredentials import com amazonaws auth PropertiesCredentials import com amazonaws services redshift AmazonRedshiftClient import com amazonaws...

Page 34: ...ClusterSubnetGroupsResult result1 client describeClusterSub netGroups request1 List String subnetNames new ArrayList String We can work with just the first group returned since we requested info about...

Page 35: ...subnet group getSubnets System out format Subnet s s s n subnet getSubnetIdenti fier subnet getSubnetAvailabilityZone getName subnet getSub netStatus Manage Cluster Subnet Groups Using Amazon Redshif...

Page 36: ...cluster or later Also you can associate a security group with multiple clusters Amazon Redshift provides a security group called default which is created automatically when you launch your first clust...

Page 37: ...Cloud VPC p 23 Managing Security Groups Using the Console Topics Creating a Security Group p 34 Managing Ingress Rules for a Security Group p 34 Deleting a Security Group p 36 Associating a Security...

Page 38: ...he new group will be displayed in the list of security groups Managing Ingress Rules for a Security Group To manage ingress rules for a security group 1 Sign into the AWS Management Console and open t...

Page 39: ...ct the account to use By default the account currently logged into the console is used In this case you need click the name of the EC2 security group you want in the EC2 Security Group Name box If you...

Page 40: ...m redshift 2 In the navigation pane click Security Groups 3 Click the row for the security group that you want to delete and then click Delete One row must be selected for the Delete button to be enab...

Page 41: ...fying an owner ID and Amazon EC2 security group combination For step by step instructions to run the following example see Running Java Examples for Amazon Redshift Using Eclipse p 118 You need to upd...

Page 42: ...yGroup result client authorizeClusterSecurityGroupIn gress request request new AuthorizeClusterSecurityGroupIngressRequest withClusterSecurityGroupName clusterSecurityGroupName withEC2SecurityGroupNam...

Page 43: ...sult getClusterSecurityGroups private static void printResultSecurityGroups List ClusterSecurityGroup groups if groups null System out println nDescribe cluster security groups result is null return S...

Page 44: ...curity group describe cluster security groups revoke cluster security group ingress You can use the following Amazon Redshift APIs to manage security groups AuthorizeClusterSecurityGroupIngress Create...

Page 45: ...ter group is named default redshift version For example default redshift 1 0 The list of parameters available in a parameter group depends on the parameter group family to which it belongs A parameter...

Page 46: ...parameter type changes to user to indicate that the value has been modified For parameter groups that you create you can modify a parameter value at any time or you can reset all parameter values to t...

Page 47: ...Parameter Group p 44 Modifying a Parameter Group p 44 Deleting a Parameter Group p 46 Associating a Parameter Group with a Cluster p 47 You can create modify and delete parameter groups by using the...

Page 48: ...Create Cluster Parameter Group dialog box specify a parameter group family a parameter group name and a parameter group description When the settings are as you want them click Yes Create Modifying a...

Page 49: ...hen the settings are as you want them click Save Changes 5 To edit the workload management configuration in the parameter group list click the row of the parameter group you want to modify and then cl...

Page 50: ...ue in the list Click the X button of the queue to remove Delete a queue b When the settings are as you want them click Save Changes Deleting a Parameter Group If a parameter group is associated with o...

Page 51: ...ollowing tasks related to parameter groups Creating a parameter group Modifying a parameter group Associating a parameter group with a cluster Getting information about parameter groups This example c...

Page 52: ...n the following example see Running Java Examples for Amazon Redshift Using Eclipse p 118 You need to update the code and provide a cluster identifier import java io IOException import java util Array...

Page 53: ...eClusterPara meterGroups printResultClusterParameterGroups result private static void modifyClusterParameterGroup List Parameter parameters new ArrayList Parameter parameters add new Parameter withPar...

Page 54: ...Name DescribeClusterParametersRequest request new DescribeClusterParamet ersRequest withParameterGroupName parameterGroupName DescribeClusterParametersResult result client describeClusterParamet ers r...

Page 55: ...groups describe default cluster parameters modify cluster parameter group reset cluster parameter group You can use the following Amazon Redshift APIs to manage parameter groups CreateClusterParamete...

Page 56: ...f you set the retention period to zero automated snapshots are disabled in which case Amazon Redshift stops taking snapshots and deletes any existing automated snapshots for the cluster Only Amazon Re...

Page 57: ...h any other security or parameter group you will need to manually associate those groups with the new cluster Copying an Automated Snapshot Amazon Redshift deletes an automated snapshot at the end of...

Page 58: ...filter the list by using the snapshot type a time range and associated cluster When you select an existing snapshot the snapshot details are shown in line in the list as shown in the example below Dep...

Page 59: ...of being created Deleting a Manual Snapshot To delete a manual snapshot 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the na...

Page 60: ...his copy as long as you want To copy an automated snapshot 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the navigation pane...

Page 61: ...https console aws amazon com redshift 2 In the navigation pane click Snapshots 3 If you need to filter the list in order to find the snapshot that you want to delete do any or all of the following In...

Page 62: ...ge the value as appropriate c Select Allow Version Upgrade as appropriate d In the Availability Zone box accept the availability zone from the snapshot or change the value as appropriate 7 Click Resto...

Page 63: ...ster Displaying information about all the snapshots of a cluster Deleting manual snapshots of a cluster In this example a snapshot of the cluster is initiated When the snapshot is successfully created...

Page 64: ...pshotId describeSnapshots deleteManualSnapshotsBefore createDate describeSnapshots catch Exception e System err println Operation failed e getMessage private static Date createManualSnapshot String sn...

Page 65: ...ts System out format Identifier s n snapshot getSnapshotIdentifi er System out format Snapshot type s n snapshot getSnapshotType System out format Snapshot create time s n snapshot getSnap shotCreateT...

Page 66: ...er snapshot revoke snapshot access You can use the following Amazon Redshift API actions to manage snapshots AuthorizeSnapshotAccess CopyClusterSnapshot CreateClusterSnapshot DeleteClusterSnapshot Des...

Page 67: ...or the AWS SDKs to obtain event information You can obtain a list of all events or you can apply filters such as event duration or start and end date to obtain events information for a specific period...

Page 68: ...ts displayed To filter events 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the navigation pane click Events 3 To filter eve...

Page 69: ...fier public static String eventSourceType provide source type e g cluster snapshot public static void main String args throws IOException AWSCredentials credentials new PropertiesCredentials ListEvent...

Page 70: ...Type s n event getSourceType System out format Message s n event getMessage System out format Date s n event getDate View Events Using Amazon Redshift CLI and API You an use the following Amazon Reds...

Page 71: ...h Command Line Interface CLI or one of the AWS Software Development Kits SDKs Query Load Performance Data Performance data helps you monitor database activity and performance This data is aggregated i...

Page 72: ...it that tell you what the metric is applicable to that is the scope of the metric Amazon Redshift has the following two dimensions Metrics that have a NodeID dimension are metrics that provide perfor...

Page 73: ...ignored and only HEALTHY or UNHEALTHY are displayed In Amazon CloudWatch values different than 1 and 0 may occur because of sampling issue Any value below 1 for HealthStatus is reported as 0 UNHEALTH...

Page 74: ...formance data can be used to help you understand the relation between database performance and cluster metrics For example if you notice that a cluster s CPU spiked you can find the spike on the clust...

Page 75: ...h Metrics p 68 Load Detail Working with Performance Data in the Amazon Redshift Console This section explains how to view performance data in the Amazon Redshift console which includes information abo...

Page 76: ...diagnosing why the query was slow and what can be done to improve its performance The default cluster view shows all nodes graphed together an Average statistic and data for the last hour You can cha...

Page 77: ...ed for metrics Select a metrics from the Metrics drop down By default all metrics are shown Change what metrics are shown all or a specific metric Click the Nodes button By default node data for a giv...

Page 78: ...ReceiveThroughput value is displayed during the query execution Viewing Cluster Metrics During Query Execution You can view cluster metrics during query execution to help identify poorly performing qu...

Page 79: ...rics scroll to the bottom of the query details For more examples of the Query tab see Cluster Metrics During Query Execution Examples p 76 In the following example the CPUUtilization and NetworkReceiv...

Page 80: ...owing example the Query Properties Details SQL and Loaded Files information for the query are displayed In the following example an Explain Plan is shown API Version 2012 12 01 76 Amazon Redshift Mana...

Page 81: ...s during load operations 1 Sign into the AWS Management Console and open the Amazon Redshift console at https console aws amazon com redshift 2 In the left navigation click Clusters 3 In the cluster l...

Page 82: ...ill help indicate when the node is over utilized Likewise setting an alarm for low CPUUtilization on a cluster node will help indicate when the node is underutilized This section explains how to creat...

Page 83: ...opic in the Create Alarm dialog by clicking the create topic link The details of your alarm will vary with your circumstance In the following example the average CPU utilization of a node Compute 0 ha...

Page 84: ...age space of a node When working with Amazon Redshift metrics in the Amazon CloudWatch console command line tools or an Amazon SDK there are two concepts to keep in mind First you specify the metric d...

Page 85: ...tions In general metrics in for this dimension e g ReadIOPS that are also metrics of nodes represent an aggregate of the node metric data You should take care in interpreting these metrics because the...

Page 86: ...or each compute node that you want to reserve When you reserve a compute node you pay a fixed up front charge and then an hourly recurring charge whether your cluster is running or not The hourly char...

Page 87: ...of those nodes at the reserved node rate Also any similar nodes that you run in excess of the number of nodes you reserved will be billed at the normal rate Let s look at the following examples Suppo...

Page 88: ...llowing Type a full or partial reservation ID The displayed items will be only offers that include the string you type Click All Purchases Active or Inactive Purchasing Reserved Node Offerings To purc...

Page 89: ...Continue 6 Review the offering details and then click Purchase 7 In the reserved nodes list check the Status of your offering When the status changes to active you can use your node At that time the...

Page 90: ...va io IOException import java util ArrayList import com amazonaws auth AWSCredentials import com amazonaws auth PropertiesCredentials import com amazonaws services redshift AmazonRedshiftClient import...

Page 91: ...ervedNodes System out println Listing nodes already purchased for ReservedNode node result getReservedNodes printReservedNodeDetails node private static void findReservedNodeOffer DescribeReservedNode...

Page 92: ...ringNotFoundException ex2 catch ReservedNodeQuotaExceededException ex3 catch Exception ex4 System out println Finished private static void printOfferingDetails ReservedNodeOffering offering System out...

Page 93: ...purchase reserved node offerings purchase reserved node offering describe reserved node offerings describe orderable cluster options You can use the following Amazon Redshift APIs to purchase reserved...

Page 94: ...wn credentials and only the permissions that you explicitly grant so that you do not have to share your account credentials with other users By default IAM users have no permissions you must grant per...

Page 95: ...rSecurityGroup CreateClusterSnapshot redshift CreateClusterSnapshot CreateClusterSubnetGroup redshift CreateClusterSubnetGroup DeleteCluster redshift DeleteCluster DeleteClusterParameterGroup redshift...

Page 96: ...redshift ResetClusterParameterGroup RestoreFromClusterSnapshot redshift RestoreFromClusterSnapshot RevokeClusterSecurityGroupIngress redshift RevokeClusterSecurityGroupIngress The following section pr...

Page 97: ...Redshift using the console you might need to grant additional access permissions as shown in the following example 2 in which you grant access to Cloudwatch metrics the console provides By default all...

Page 98: ...rmissions of their own access to Amazon Redshift actions does not cover them The AWS IAM console provides the following predefined Amazon Redshift policy templates that you can use to allow the necess...

Page 99: ...manage your client s connectivity to your clusters including working with Secure Sockets Layers SSL and server certificates dealing with firewall timeout issues and using the SQL Workbench J to connec...

Page 100: ...g SQL Workbench J Follow the instructions for installing SQL Workbench J on your system Note SQL Workbench J requires the Java Runtime Environment JRE be installed on your system Ensure you are using...

Page 101: ...avigation pane 3 In the Driver box click PostgreSQL org postgresql Driver 4 When a message appears saying that the library for the driver is not available click Yes 5 Click the browse icon next to the...

Page 102: ...from the Amazon Redshift Getting Started Guide you can test your connection by typing the following query into the Statement window select from users order by userid limit 100 If your connection is s...

Page 103: ...e b In the Amazon Redshift console click the name of your cluster and write down or copy the ODBC URL from the Configuration pane You will use portions of the URL string in step e c In the Control Pan...

Page 104: ...box type a description for your reference In the SSL Mode box click the mode that you want For best results we recommend the prefer setting In the Server box type the server string from the ODBC URL y...

Page 105: ...ype a name for your Connection Profile that will appear in the navigation pane 3 In the Driver box click ODBC Bridge sun jdbc odbc JdbcOdbcDriver 4 In the URL box replace datasource_name_from_control_...

Page 106: ...shift Getting Started Guide you can test your connection by typing the following query into the Statement window select from users order by userid limit 100 If your connection is successful a listing...

Page 107: ...stalling the PostgreSQL client tools select your operating system from the PostrgreSQL binary downloads page at http www postgresql org download Connect by Using the psql Defaults By default psql does...

Page 108: ...operates with sslmode prefer To specify that psql opens an encrypted connection and uses an Amazon Redshift certificate to verify the service download an Amazon Redshift certificate to your computer...

Page 109: ...and your cluster it should not be relied on for authenticating the server To authenticate the server install the public key pem file for the SSL certificate on your client and use the key to connect...

Page 110: ...keystore_name alias alias file certificate_filename Where keystore_name is the keystore that your client application uses 2 Specify the following properties in your connection string property name ss...

Page 111: ...the cluster e g 5439 DBNAME is the name of the database you specified when you created the cluster e g mydb In Java code you can specify the connection string as follows Connection conn null Propertie...

Page 112: ...can refer to the keystore by specifying a property when you run your code as follows Djavax net ssl trustStore path to keystore Djavax net ssl trustStorePassword keystore password API Version 2012 12...

Page 113: ...me master user name static final String MasterUserPassword master user password public static void main String args Connection conn null Statement stmt null try Dynamically load postgresql driver at r...

Page 114: ...em out println Finished connectivity test Connecting to a Cluster by Using NET When you use NET C to programmatically connect to your cluster you can do so with or without server authentication If you...

Page 115: ...on string Port e g 5439 String Port provide port MasterUserName e g masteruser Note use of single quotes String MasterUsername provide master user name MasterUserPassword e g mypassword Note use of si...

Page 116: ...uter you are using to connect to your cluster The timeout period should be adjusted for your client and network See the To change TCP IP timeout settings p 112 Set the TCP IP keep alive behavior for y...

Page 117: ...IP keep alive behavior in your connection string as explained in the following procedure To specify the TCP IP keep alive setting in a connection string In the JDBC connection string that you specify...

Page 118: ...limits on the number of nodes that you can allocate provision for a cluster even if you request a quota increase These limits are different for different node types For a list of the current limits s...

Page 119: ...me Must be between 8 and 64 characters in length Must contain at least one uppercase letter Must contain at least one lowercase letter Must contain one number Can be any printable ASCII character ASCI...

Page 120: ...hanumeric characters or hyphens Must not be Default Must be unique for all security groups that are created by your AWS account Cluster snapshot identifier API Version 2012 12 01 116 Amazon Redshift M...

Page 121: ...the libraries compute the necessary signature and include it in the request to Amazon Redshift Note Amazon Redshift is supported by the AWS SDK for Java NET PHP Ruby and Node js In this guide example...

Page 122: ...sterUsername masteruser withMasterUserPassword 12345678Aa withNodeType dw hs1 xlarge withNumberOfNodes 2 Cluster createResponse client createCluster request System out println Created cluster createRe...

Page 123: ...k lib sdk third party commons logging 1 1 1 sdk third party httpcomponents client 4 1 1 sdk third party jackson core 1 8 CreateAndModifyClusterSecurityGroup Change the class path separator as needed f...

Page 124: ...mazon S3 by using client side encryption a database COPY operation will decrypt the data as it is copied to Amazon Redshift The COPY command uses the master symmetric key you specified For more inform...

Page 125: ...zonaws services s3 AmazonS3 import com amazonaws services s3 AmazonS3EncryptionClient import com amazonaws services s3 model EncryptionMaterials import com amazonaws services s3 model PutObjectRequest...

Page 126: ...ory Skip sub directories int retryCount 0 boolean done false while done try uploadToS3 s3 bucketName s3Prefix fileEntry done true catch Exception e retryCount if retryCount MAX_RETRY_COUNT System out...

Page 127: ...Request ID ase getRequestId throw ase catch AmazonClientException ace System out println Caught an AmazonClientException which means the client encountered a serious internal problem while trying to...

Page 128: ...same hash function and input that you used to sign the request If the resulting signature matches the signature in the request Amazon Redshift processes the request otherwise the request is rejected...

Page 129: ...he general request syntax is https redshift us east 1 amazonaws com Action CreateCluster ClusterIdentifier examplecluster MasterUsername masteruser MasterUserPassword 12345678Aa NumberOfNode 2 NodeTyp...

Page 130: ...is as the key to hash the data us east 1 The result of this hash becomes the key for the next hash function After you calculate the derived key you use it in a hash function that accepts two input str...

Page 131: ...scribe cluster snapshot sharing and create snapshot progress metrics New Feature 08 July 2013 Published various documentation fixes Documentation fixes 22 April 2013 Updated the Amazon Redshift Manage...

Reviews: