AWS Glue vs InfoSphere Information Server


Our analysts compared AWS Glue vs InfoSphere Information Server based on data from our 400+ point analysis of ETL Tools, user reviews and our own crowdsourced data from our free software selection platform.

Product Basics

AWS Glue is a fully managed, serverless data integration service that extracts, cleanses and organizes data for insights. Automatic code generation lets citizen data scientists and power users create and schedule integration workflows, and an event-driven architecture allows triggers to launch data integration processes.

A common data catalog with automatic schema generation ensures data is unique and easily accessible. With streaming data integration, it catalogs assets from datastores like Amazon S3, making them available for querying with Amazon Athena and Redshift Spectrum. Developers can use ready-made endpoints to edit and test code.

Pros
  • Serverless & Scalable
  • Easy Visual Workflow
  • Built-in Data Connectors
  • Pay-per-Use Pricing
  • AWS Ecosystem Integration
Cons
  • Complex Transformations
  • Limited On-Premise Data
  • Python & Scala Only
  • Potential Cost Overruns
  • AWS Lock-in Concerns
InfoSphere Information Server is a data integration powerhouse designed to unify information across complex, diverse systems. It excels at extracting, transforming, and loading data (ETL/ELT) for tasks like building data warehouses, powering analytics, and driving business insights. Best suited for large enterprises with demanding data needs and dedicated IT resources, InfoSphere boasts robust features like comprehensive data source/target connectors, powerful transformation tools, and advanced governance capabilities.

User feedback highlights its scalability, security, and job automation as key benefits. However, its complexity and steep learning curve can be daunting for smaller setups. Additionally, the high licensing costs and resource-intensive nature might dissuade budget-conscious organizations.

Compared to other data integration tools, InfoSphere leans towards high-volume, mission-critical scenarios, while alternative options might offer simpler setups or cater to broader use cases. Choosing the right fit depends on individual needs and priorities. Ultimately, InfoSphere Information Server shines when organizations need a robust, feature-rich platform to conquer complex data challenges, even at the cost of increased upfront investment and initial learning hurdles.

Pros
  • Powerful ETL & ELT capabilities
  • Wide range of data sources & targets
  • Job scheduling & monitoring
  • Data quality & transformation tools
  • Scalable & secure architecture
Cons
  • Steep learning curve & complexity
  • High licensing costs
  • Limited out-of-the-box connectors
  • Performance bottlenecks with large datasets
  • Resource-intensive for deployment & maintenance
AWS Glue pricing: $0.44/M-DPU-Hour. Free trial is unavailable.
InfoSphere Information Server pricing: $20,000 annually. Free trial is unavailable.

Product Insights

AWS Glue: benefits
  • Effortless Data Integration: Streamline data movement across diverse sources like databases, applications, and cloud storage with pre-built connectors and automated schema discovery.
  • Simplified Data Preparation: Clean, transform, and enrich data with a visual drag-and-drop interface and built-in transformations, eliminating the need for complex coding.
  • Serverless Scalability: Forget infrastructure management! Glue seamlessly scales to handle massive data volumes without upfront provisioning or ongoing maintenance.
  • Cost-Effective Flexibility: Pay-per-use pricing based on actual resource consumption makes Glue ideal for both small and large data pipelines, optimizing your costs.
  • Seamless AWS Integration: Leverage the power of the AWS ecosystem! Glue effortlessly integrates with S3, Redshift, and other AWS services, creating a unified data pipeline within your existing infrastructure.
  • Improved Data Accessibility: Deliver prepared data to data lakes, data warehouses, and analytics platforms, democratizing access for data scientists, analysts, and business users.
  • Enhanced Collaboration: Share data pipelines and workflows with other users and teams, fostering collaboration and streamlining data-driven workflows.
  • Centralized Data Catalog: Maintain a single source of truth for your data assets with Glue Data Catalog, ensuring data consistency and discoverability.
  • Continuous Monitoring and Optimization: Track job performance, identify bottlenecks, and optimize your pipelines for efficiency with built-in monitoring and logging tools.
  • Future-Proof Data Infrastructure: Stay ahead of the curve with Glue's serverless architecture and cloud-native approach, adapting to your evolving data needs with ease.
InfoSphere Information Server: benefits
  • Unified Data Landscape: Break down data silos and seamlessly integrate information from diverse sources, including databases, applications, and cloud platforms, providing a holistic view for improved decision-making.
  • Enhanced Data Quality: Ensure data accuracy and consistency through powerful cleansing, standardization, and transformation tools, minimizing errors and boosting trust in your data assets.
  • Streamlined Data Movement: Automate and orchestrate data movement using flexible ETL/ELT workflows, speeding up data delivery and reducing manual effort for critical business processes.
  • Powerful Data Transformation: Manipulate and enrich data to meet specific needs through a comprehensive set of transformation functions, unlocking its full potential for analytics and reporting.
  • Scalable and Secure Architecture: Handle large data volumes and complex integrations with confidence thanks to a robust and secure architecture, ensuring uninterrupted data flow and safeguarding sensitive information.
  • Improved Operational Efficiency: Automate routine data tasks, reduce IT workload, and optimize resource utilization, freeing up staff and resources for higher-value activities.
  • Enhanced Data Governance: Implement data governance policies and ensure compliance with regulations through features like lineage tracking and access controls, fostering data transparency and responsible usage.
  • Greater Business Agility: Respond rapidly to changing data needs and support new initiatives with agile data integration capabilities, empowering faster time-to-market and increased business flexibility.
  • Improved Collaboration and Decision-Making: Foster greater collaboration across teams by providing everyone with access to reliable and consistent data, leading to better informed decisions and data-driven strategies.
  • Reduced Integration Costs: Streamline data management processes, eliminate data redundancies, and optimize infrastructure usage by consolidating data integration needs into a single platform, potentially leading to cost savings.
AWS Glue: key features
  • Console: Discover, transform and make data assets available for querying and analysis. Build complex data integration pipelines that handle dependencies, filter bad data and retry jobs after failures. Monitor jobs and get task status alerts via Amazon CloudWatch.
  • Data Catalog: Gleans and stores metadata in the catalog for workflow authoring, with full version history. Search and discover desired datasets from the data catalog, irrespective of where they are located. Saves time and money – automatically computes statistics and registers partitions with a central metadata repository. 
  • Automatic Schema Discovery: Creates metadata automatically by gleaning schema, quality and data types through built-in datastore crawlers and stores it in the Data Catalog. Ensure up-to-date assets – run crawlers on a schedule, on-demand or based on event triggers. Manage streaming data schemas with the Schema Registry. 
  • Event-driven Architecture: Move data automatically into data lakes and warehouses by setting triggers based on a schedule or event. Invoke extract, transform and load (ETL) jobs from a Lambda function as soon as new data becomes available.
  • Visual Data Prep: Prepare assets for analytics and machine learning through Glue DataBrew. Automate anomaly filtering, convert data to standard formats and rectify invalid values with more than 250 pre-designed transformations – no need to write code. 
  • Materialized Views: Create a virtual table from multiple different data sources by using SQL. Copies data from each source data store and creates a replica in the target datastore as a materialized view. Ensures data is always up-to-date by monitoring data in source stores continuously and updating target stores in real time. 
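The event-driven pattern described in the feature list (a Lambda function launching an ETL job when new data arrives) can be sketched roughly as follows. This is a minimal illustration, not AWS's reference implementation: the job name `example-etl-job` and the `--source_path` argument are hypothetical, and the optional `glue_client` parameter is added here only so the handler can be exercised without AWS credentials.

```python
from urllib.parse import unquote_plus


def handler(event, context, glue_client=None):
    """Start one Glue job run for each S3 object named in the triggering event."""
    if glue_client is None:
        # Imported lazily so the sketch can be tested with a stand-in client.
        import boto3
        glue_client = boto3.client("glue")

    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName="example-etl-job",  # hypothetical job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},  # hypothetical argument
        )
        run_ids.append(response["JobRunId"])
    return run_ids
```

In a real deployment the function would be subscribed to S3 object-created notifications, and the Glue job would read the path from its job arguments.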
InfoSphere Information Server: key features
  • Data Integration: Collect, transform and share large amounts of complex data assets across the organization. Reduce development time, and scale flexibly by leveraging built-in data transformation functions. Deliver data in real time to business applications in bulk, virtually or through change data capture (CDC). 
  • Data Quality: Cleanse and validate data — in batches and real time — then load it into analytical views for consistent monitoring. Establish data quality metrics across the organization by reusing these data views. Ensure consistent information organization-wide by linking related records across systems. 
  • Business Glossary: Create a single source of truth – consolidate disparate data into unique, reliable records and load into repositories and master data applications. Share insights with confidence powered by complete access to proof of lineage and data quality. A centralized hub maintains data governance rules.
  • Information Governance Catalog: Empower data scientists and analysts to explore and analyze business data in compliance with enterprise governance standards. Create, manage and share a common business language, design and apply rules and track data lineage. Extend on-premise governance investment to the cloud by integrating with IBM Watson Knowledge Catalog. 
  • Metadata Repository: Share imported metadata and other assets in any server component across the organization. Stores project configurations, reports and results for all the server’s components in one unified repository.

Product Ranking

AWS Glue: #9 among all ETL Tools
InfoSphere Information Server: #32 among all ETL Tools

Analyst Rating Summary

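Overall analyst rating: AWS Glue 88, InfoSphere Information Server 97
Data Delivery: AWS Glue 100, InfoSphere Information Server 100
Data Quality: AWS Glue 92, InfoSphere Information Server 99
Data Sources And Targets Connectivity: AWS Glue 62, InfoSphere Information Server 94

Categories highlighted for AWS Glue: Data Delivery, Performance and Scalability, Platform Capabilities, Platform Security, Workflow Management
Categories highlighted for InfoSphere Information Server: Data Delivery, Metadata Management, Platform Security, Workflow Management, Data Quality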

Analyst Ratings for Functional Requirements

Scores on a 0 to 100 scale, listed as AWS Glue / InfoSphere Information Server:
Data Delivery: 100 / 100
Data Quality: 92 / 99
Data Sources And Targets Connectivity: 62 / 94
Data Transformation: 90 / 95
Metadata Management: 96 / 100
Platform Capabilities: 100 / 0
Workflow Management: 100 / 100

User Sentiment Summary

AWS Glue has a 'great' User Satisfaction Rating of 85% when considering 165 user reviews from 3 recognized software review sites.

InfoSphere Information Server has a 'good' User Satisfaction Rating of 77% when considering 80 user reviews from 3 recognized software review sites.

Ratings by review site (review counts in parentheses):
AWS Glue: 4.0 (46), 4.4 (109), 3.9 (10)
InfoSphere Information Server: 4.0 (21), 4.6 (27), 3.1 (32)
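The satisfaction percentages above appear consistent with a review-count-weighted average of the per-site ratings, expressed as a share of a 5-point scale. The sketch below checks that reading. It assumes the six per-site figures alternate between the two products (which makes the review counts sum to 165 and 80) and that ratings are out of 5; the page does not state its actual formula, so treat this as one plausible reconstruction.

```python
def satisfaction_percent(site_ratings, scale=5.0):
    """Review-count-weighted mean rating, as a percentage of the rating scale."""
    total_reviews = sum(count for _, count in site_ratings)
    weighted_sum = sum(rating * count for rating, count in site_ratings)
    return 100.0 * weighted_sum / (total_reviews * scale)


# (rating, review count) per site; assignment to products is an assumption.
glue = [(4.0, 46), (4.4, 109), (3.9, 10)]
infosphere = [(4.0, 21), (4.6, 27), (3.1, 32)]

print(round(satisfaction_percent(glue)))        # 85
print(round(satisfaction_percent(infosphere)))  # 77
```

Under these assumptions, sites with more reviews dominate the result, which is why the single 3.1 rating from 32 reviews pulls the InfoSphere figure down noticeably.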

Awards

SelectHub research analysts have evaluated AWS Glue and concluded it earns best-in-class honors for Workflow Management.


SelectHub research analysts have evaluated InfoSphere Information Server and concluded it earns best-in-class honors for Workflow Management.


Synopsis of User Ratings and Reviews

AWS Glue: Pros
Cost-Effective & Serverless: Pay only for resources used, with no server provisioning or maintenance
Simplified ETL workflows: Drag-and-drop UI & auto-generated code for easy job creation, even for non-programmers
Data Catalog: Unified metadata repository for seamless discovery & access across various data sources
Flexible Data Integration: Connects to diverse data sources & destinations (S3, Redshift, RDS, etc.)
Built-in Data Transformations: Apply pre-built & custom transformations within workflows for efficient data cleaning & shaping
Visual Data Cleaning (Glue DataBrew): Code-free data cleansing & normalization for analysts & data scientists
Scalability & Performance: Auto-scaling resources based on job needs, efficient Apache Spark engine for fast data processing
Community & Support: Active user community & helpful AWS support resources for problem-solving & best practices
InfoSphere Information Server: Pros
Powerful Data Handling: Handles complex ETL/ELT processes and diverse data sources (relational, flat files, cloud platforms) with ease, streamlining data movement and integration.
Enhanced Data Quality: Ensures data accuracy and consistency through robust cleansing, validation, and transformation tools, boosting trust and reliability in data insights.
Scalability and Security: Supports large data volumes and complex integrations with confidence, safeguarding sensitive information through granular access controls and encryption.
Job Automation and Monitoring: Automates data pipelines and workflows, reducing manual effort and simplifying task management with comprehensive monitoring capabilities.
Improved Operational Efficiency: Centralizes data integration processes, eliminates data silos, and frees up IT resources for higher-value tasks, leading to cost savings and increased productivity.
AWS Glue: Cons
Limited Customization & Control: Visual interface and pre-built transformations may not be flexible enough for complex ETL needs, requiring manual coding or custom Spark jobs.
Debugging Challenges: Troubleshooting Glue jobs can be complex due to limited visibility into underlying Spark code and distributed execution, making error resolution time-consuming.
Performance Limitations for Certain Workloads: Serverless architecture may not be optimal for latency-sensitive workloads or large-scale data processing, potentially leading to bottlenecks.
Vendor Lock-in & Portability: Migrating ETL workflows from Glue to other platforms can be challenging due to its proprietary nature and lack of open-source compatibility.
Pricing Concerns for Certain Use Cases: Pay-per-use model can be expensive for long-running ETL jobs or processing massive datasets, potentially exceeding budget constraints.
InfoSphere Information Server: Cons
Steep Learning Curve: Complex interface and feature-rich platform require significant training and expertise, even for experienced data professionals.
High Cost of Ownership: Licensing fees, maintenance, and potential hardware/infrastructure upgrades can make it a costly solution for smaller organizations or simpler data needs.
Limited Out-of-the-Box Connectors: May require custom development or third-party tools for integration with certain data sources or applications, increasing implementation complexity and costs.
Performance Bottlenecks: Can experience slowdowns or scalability issues with very large datasets or intricate ETL/ELT processes, demanding careful optimization and resource allocation.
Limited Cloud-Native Functionality: Core features are primarily designed for on-premises deployments, with cloud options requiring additional setup and configuration, potentially hindering agility and flexibility.

User reviews of AWS Glue paint a picture of a powerful and user-friendly ETL tool for the cloud, but one with limitations. Praise often centers around its intuitive visual interface, making complex data pipelines accessible even to non-programmers. Pre-built connectors and automated schema discovery further simplify setup, saving users time and effort. Glue's serverless nature and tight integration with the broader AWS ecosystem are also major draws, offering seamless scalability and data flow within a familiar environment.

However, some users find Glue's strength in simplicity a double-edged sword. For complex transformations beyond basic filtering and aggregation, custom scripting in Python or Scala is required, limiting flexibility for those unfamiliar with these languages. On-premise data integration is another pain point, with Glue primarily catering to cloud-based sources. This leaves users seeking hybrid deployments or integration with legacy systems feeling somewhat stranded.

Cost also arises as a concern. Glue's pay-per-use model can lead to unexpected bills for large data volumes or intricate pipelines, unlike some competitors offering fixed monthly subscriptions. Additionally, Glue's deep integration with AWS can create lock-in anxieties for users worried about switching cloud providers in the future.

Overall, user reviews suggest Glue shines in cloud-based ETL for users comfortable with its visual interface and scripting limitations. Its scalability, ease of use, and AWS integration are undeniable strengths. However, for complex transformations, on-premise data needs, or cost-conscious users, alternative tools may offer a better fit.
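The pay-per-use cost concern raised in these reviews can be made concrete with some simple arithmetic. The sketch below uses the $0.44 hourly rate listed on this page and treats it as a flat charge per DPU-hour, which is an assumption; it does not model minimum billing durations, worker types, or any other pricing detail. The function names and the example workload are illustrative only.

```python
RATE_PER_DPU_HOUR = 0.44  # USD; the rate listed on this page, treated as a flat rate


def estimate_job_cost(dpus, runtime_minutes, rate=RATE_PER_DPU_HOUR):
    """Cost of one job run: DPUs x hours x hourly rate, rounded to cents."""
    dpu_hours = dpus * runtime_minutes / 60.0
    return round(dpu_hours * rate, 2)


def estimate_monthly_cost(dpus, runtime_minutes, runs_per_day, days=30,
                          rate=RATE_PER_DPU_HOUR):
    """Cost of a recurring job over a month of `days` days."""
    dpu_hours = dpus * runtime_minutes / 60.0 * runs_per_day * days
    return round(dpu_hours * rate, 2)


# Hypothetical workload: 10 DPUs for 30 minutes, once a day.
print(estimate_job_cost(10, 30))         # 2.2
print(estimate_monthly_cost(10, 30, 1))  # 66.0
```

Because cost scales linearly with both DPU count and runtime, a pipeline that runs longer or more often than planned raises the bill in direct proportion, which is the "unexpected bills" scenario reviewers describe.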

User reviews of InfoSphere Information Server paint a picture of a powerful data integration tool, capable of handling complex tasks and diverse data sources. Admiration for its robust ETL/ELT capabilities, data quality tools, and secure architecture echoes frequently, with users citing improved data accuracy and streamlined data movement as major benefits. Automation features and job monitoring are also praised for boosting operational efficiency and freeing up resources.

However, the praise comes with caveats. The steep learning curve and demanding resource requirements are consistent gripes, making InfoSphere a better fit for larger organizations with dedicated IT expertise and infrastructure. The high cost of ownership, including licensing, maintenance, and potential hardware upgrades, further strengthens this point. Additionally, the limited out-of-the-box connectors and potential performance bottlenecks with massive datasets are concerns for some users.

Compared to competitors, InfoSphere shines in its scalability and security, catering to high-volume, mission-critical scenarios. However, users also acknowledge the presence of simpler, more user-friendly options that might be better suited for smaller setups or less complex data needs. Ultimately, the choice boils down to individual priorities. If data volume, security, and advanced features are paramount, InfoSphere stands out, despite its demanding nature. But for those seeking a smoother learning curve or broader use cases, other solutions might offer a better fit.

In essence, user reviews reveal InfoSphere Information Server as a powerful tool for complex data challenges, but its strengths come with a price tag and learning curve. Carefully evaluating data needs and priorities is crucial before choosing this data integration powerhouse.


Top Alternatives in ETL Tools


Azure Data Factory

Cloud Data Fusion

Dataflow

DataStage

Fivetran

Hevo

IDMC

Informatica PowerCenter

InfoSphere Information Server

Integrate.io

Oracle Data Integrator

Pentaho

Qlik Talend Data Integration

SAP Data Services

SAS Data Management

Skyvia

SQL Server

SQL Server Integration Services

Talend

TIBCO Cloud Integration
