Hadoop Developer Resume
Baltimore, MD
SUMMARY
- 5 years of experience in design and development, with knowledge of Hadoop administration activities such as installation, configuration, and maintenance of clusters.
- Hands-on experience with Hadoop and related technologies: Hive, Pig, Sqoop, Oozie, Flume, and MapReduce.
- Familiar with the Hadoop framework, including the Hadoop ecosystem: MapReduce, Pig, Hive, Flume, Spark, ZooKeeper, Oozie, and Impala.
- Proficient in Informatica PowerCenter ETL development (Administrator, Designer, Workflow Manager, Workflow Monitor, Repository Manager, Metadata Manager) for extracting, cleaning, managing, transforming, and loading data.
- Familiar with Big Data architecture and communication systems under the Hadoop framework.
- Good programming experience with SQL and PL/SQL database technologies and relational databases, including Oracle, Teradata, and MS SQL.
- Familiar with NoSQL databases: HBase, MongoDB, and Cassandra.
- Experience with data warehouse ETL (Extract, Transform, and Load).
- Experience writing MapReduce programs in Java and Python for data processing and analysis.
- Wrote startup and shutdown scripts and crontabs; scripting and automation using shell scripts (BASH, KSH) in Linux.
- Configured Spark Streaming to receive real-time data from Kafka and store the streaming data into HDFS.
- Experienced in writing Hadoop jobs for data analysis using Hive and Pig.
- Experience importing/exporting data using the data management tool Sqoop.
- Good Knowledge in streaming the data to HDFS using Flume.
- Hands-on experience with Apache Hadoop administration and Linux administration.
- Experienced in Big Data storage and file system design.
- Experience in Design, Installation, Configuration, Support & managing the Hadoop Clusters.
- Experience developing MapReduce applications using Apache Hadoop and Big Data.
- Experience writing Hadoop scripts and MapReduce programs.
- Loaded log files from multiple sources directly into HDFS using Flume.
- Strong understanding of the Hadoop platform and other distributed data processing platforms.
- Experienced in Software Development Life Cycle (SDLC), application design, functional and technical Specs, and use case development using UML.
- Experience in web services using XML, SOAP and HTML.
- Experienced in designing and maintaining system tools for all scripts and automation processes, and in monitoring all capacity planning.
- Excellent interpersonal and communication skills; a team player willing to take on new and varied projects, with the ability to handle changing priorities and deadlines.
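The Java/Python MapReduce experience above typically follows the Hadoop Streaming pattern: a mapper emits key/value pairs and a reducer aggregates them after the shuffle. A minimal, self-contained local sketch of that pattern (the word-count task and all names here are illustrative, not from an actual project):

```python
from itertools import groupby

def mapper(lines):
    # Emit (word, 1) pairs, as a Hadoop Streaming mapper would write
    # tab-separated key/value lines to stdout.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop Streaming delivers mapper output sorted by key; sorting and
    # grouping here stand in for the shuffle phase in this local sketch.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    logs = ["error disk full", "warn disk slow", "error net down"]
    for word, total in reducer(mapper(logs)):
        print(f"{word}\t{total}")
```

In a real Streaming job, mapper and reducer would be separate scripts reading stdin and writing stdout, submitted with the `hadoop-streaming` jar.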
TECHNICAL SKILLS
Hadoop Tools: HDFS, MapReduce, Pig, Hive, Flume, Oozie, ZooKeeper, HBase, Ambari, Sqoop
Databases: Oracle 9i, MySQL
Languages: Java, J2EE, SQL
Operating System: Windows 8, Linux, Unix
Development Tools: Eclipse, MySQL
Web Technologies: VMware, JSP, Servlets, JDBC, JavaBeans
Databases: Oracle 11g, DB2, MS SQL Server 2000/2005/2008
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential, Baltimore, MD
Responsibilities:
- Imported Bulk Data into HBase using MapReduce programs.
- Wrote multiple Java programs to pull data from HBase.
- Analyzed business functionality and requirements.
- Experienced in modifying Hive scripts to support tokenization for encrypting data.
- Created Hive scripts for table creation, data ingestion, and processing of HDFS data.
- Involved in creating Hive QL tables, loading data, and writing Hive QL queries, which invoked and ran MapReduce jobs in the backend.
- Experienced in handling complex data types in Pig, such as tuples and maps.
- Experienced in creating workflows and coordinators in Oozie for regular jobs, as well as tasks that automatically load data into HDFS.
- Involved in extracting data from various sources and processing data at rest using ecosystem components such as the MapReduce framework, HBase, Hive, Oozie, Flume, Sqoop, etc.
- Used SOAP and REST web services to exchange information.
- Wrote multiple Hive jobs to parse logs and structure them in tabular format for efficient querying of the log data.
- Experienced in tracking job status while tasks/jobs are running.
- Analyzed time-series data stored in HBase using the HBase API.
- Experienced in optimizing MapReduce algorithms using combiners and partitioners for optimal results, and in tuning application performance for HDFS clusters.
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, HBase, Apache Spark, Oozie Scheduler, Java, Unix Shell Scripts, Git, Maven, PL/SQL, Python, Scala, Cloudera.
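The log-parsing Hive jobs described above follow a common pattern: turn raw log lines into delimited columns that a Hive external table can query efficiently. A minimal local sketch of that parsing step in Python; the log layout, regex, and field names are hypothetical, not from the actual project:

```python
import re

# Hypothetical access-log layout: IP, timestamp, request, status, bytes.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_log_line(line):
    """Return one tab-delimited row suitable for a Hive external table,
    or None for malformed lines (which a real job would count and skip)."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return "\t".join(m.group("ip", "ts", "request", "status", "bytes"))

if __name__ == "__main__":
    raw = '10.0.0.7 [12/Mar/2015:06:25:24] "GET /index.html HTTP/1.1" 200 512'
    print(parse_log_line(raw))
```

In practice the tab-delimited output would land in an HDFS directory backing a Hive external table declared with `ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'`.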
Hadoop Developer
Confidential, Plano, TX
Responsibilities:
- Strong knowledge of Hadoop architecture and components such as HDFS, NameNode, DataNode, Resource Manager, Node Manager, and the YARN/MapReduce programming paradigm.
- Experienced in monitoring the Hadoop cluster through Cloudera Manager and implementing alerts based on error messages.
- Provided cluster usage metrics reports to management and usage charge-back reporting to clients.
- Imported data from multiple data sources using ETL tools to perform transformations.
- Tested the raw data and executed the scripts, distributing the responsibilities across Hadoop, Pig, and Hive.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Experienced in allocating the number of mappers and reducers for the MapReduce cluster.
- Experienced in writing shell scripts to monitor the health of Hadoop daemon services; quick in fixing error messages/failure conditions.
- Monitored multiple Hadoop Clusters environment using Ganglia.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers, mobile, and network devices, and pushed it to HDFS.
- Exported the analyzed data into relational databases using Sqoop for visualization and report generation by the BI team.
- Ran MapReduce programs on log data to transform it into a structured format, identifying user location, age group, and time spent.
- Worked with application teams to install operating system and Hadoop update patches and performed version upgrades as required.
Environment: Big Data/ Hadoop, Spark, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Impala, Oozie, Ganglia, Java, and DB2.
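The Flume pipeline described above (web-server logs collected, aggregated, and pushed to HDFS) is typically wired up with an agent configuration along these lines; the agent, source, and path names here are illustrative, not from the actual deployment:

```properties
# Hypothetical Flume agent: tail a web-server log and push events to HDFS.
agent1.sources = weblog-src
agent1.channels = mem-ch
agent1.sinks = hdfs-sink

ag1.sources.weblog-src.channels = mem-ch
agent1.sources.weblog-src.type = exec
agent1.sources.weblog-src.command = tail -F /var/log/httpd/access_log
agent1.sources.weblog-src.channels = mem-ch

agent1.channels.mem-ch.type = memory
agent1.channels.mem-ch.capacity = 10000

agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.channel = mem-ch
agent1.sinks.hdfs-sink.hdfs.path = /flume/weblogs/%Y-%m-%d
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
```

The memory channel trades durability for throughput; a file channel would be the safer choice where log loss is unacceptable.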
SQL Server Developer
Confidential
Responsibilities:
- Interacted with the team on analysis, and designed and developed the database using ER diagrams, normalization, and relational database concepts.
- Involved in Design, Development and Testing of the system.
- Developed SQL Server stored procedures and tuned SQL queries (using indexes and execution plans).
- Developed User Defined Functions and Created Views.
- Created Triggers to maintain the Referential Integrity.
- Implemented Exceptional Handling.
- Wrote complex queries to generate Crystal Reports per client requirements.
- Creating and automating the regular jobs.
- Tuned and optimized SQL queries using execution plans and SQL Profiler.
- Rebuilt indexes and tables as part of performance tuning exercises.
- Involved in performing database Backup and Recovery.
Environment: SQL Server 7.0/2000, SQL, T-SQL, Visual Basic 6.0/5.0, Crystal Reports 7/4.5
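The trigger-based referential integrity mentioned above was a common pattern on SQL Server 7.0/2000 when declarative FOREIGN KEY constraints could not be used. A sketch of the pattern in T-SQL, using hypothetical Orders/Customers tables rather than the actual schema:

```sql
-- Hypothetical tables: every Orders row must reference an existing Customer.
-- The trigger inspects the pseudo-table "inserted" and rolls back any
-- statement that introduces an orphaned CustomerID.
CREATE TRIGGER trg_Orders_CheckCustomer
ON Orders
FOR INSERT, UPDATE
AS
IF EXISTS (
    SELECT 1
    FROM inserted i
    LEFT JOIN Customers c ON c.CustomerID = i.CustomerID
    WHERE c.CustomerID IS NULL
)
BEGIN
    RAISERROR ('Order references a non-existent customer.', 16, 1)
    ROLLBACK TRANSACTION
END
```

Because `inserted` holds all affected rows, the check works for multi-row inserts and updates, not just single-row changes.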