Job type: Full-time
Role: DevOps, System Administrator
python, ansible, sysadmin
Comcast's Technology & Product organization works at the intersection of media and technology. Our innovative teams are continually developing and delivering products that transform the customer experience. From creating apps like TVGo to new features such as the Talking Guide on the X1 platform, we work every day to make a positive impact through innovation in the pursuit of building amazing products that are enjoyable, easy to use and accessible across all platforms. The team also develops and supports our evolving network architecture, including next-generation consumer systems and technologies, infrastructure and engineering, network integration and management tools, and technical standards.
As an SRE on the CDN team you'll be using tools like Ansible and Python to engineer solutions for infrastructure and application deployment related issues. You'll be bridging the gap between sysadmin and developer as you work to move our infrastructure support and automation to the next level.
The CDN SRE team is a small and fast-moving team of world-class experts who are innovating to provide a robust platform serving our IP Video delivery customers, both internal and external. We are a team that thrives on big challenges, results, quality, and agility. The CDN SRE team is positioned between the CDN Development organization, NOC/CNOC for "Eyes on Glass" Monitoring and applications/stakeholders leveraging CDN platform to provide 24x7 support with the highest level of service availability. We work with network engineers, systems administrators, software engineers, and a pile of technically-adept-but-not-actually-technical product folks. The CDN SRE team is a diverse collection of software engineers, systems administrators, network administrators, and configuration managers from all walks of life. We're a group of experienced technical minds who are the last word when it comes to solving problems encountered by our production support teams.
In your role, you will bridge the gap between development/engineering and NOC/CNOC for production support. You will develop innovative ways to perform upgrades and deployments in production to scale out for CDN growth across the Comcast footprint. As a Site Reliability Engineer, your mission is to stay on top of planned maintenances and drive solutions to reduce downtime and manual maintenance effort. Query big data stores (Splunk, ELK) to quantify the scope of reported issues. Create new metrics and identify monitoring solutions to improve site/service reliability. Evaluate new code releases for basic reliability and systems integration support, providing guidance for quality-driven delivery to production. You'll be working with the best and brightest minds in Comcast while we roll out the IP CDN infrastructure that will power the next generation of IP Video and gaming. Success in this role is characterized by a higher speed to market, with fewer customer-facing defects and outages.
Perform upgrades and deployments to production
Identify root causes and deliver permanent fixes for production incidents
Co-ordinate with network engineering and data center teams for hardware break-fix
Deliver innovative pro-active self-healing capabilities
Develop automated deployment solutions through CI/CD
Work independently as part of an agile team
Investigate new technologies to improve operational efficiency
Enable the team to achieve service level objectives for high availability, capacity growth, and performance of the overall CDN platform
Provide 24x7 on-call support for production incidents
Skills & Requirements
Strong understanding of networking concepts: TCP/IP, DHCP, basic routing
Experience with configuration management tools (Ansible, Puppet, Chef)
Experience with a variety of programming languages including, but not limited to: Go, Perl, Python, Shell/Bash, Groovy
Version control using GitHub, GitLab, or SVN
Strong understanding of content delivery networks through use of forward/reverse proxies and caching hierarchies
Familiarity with ELK, Splunk, OP5/Nagios, RabbitMQ, Grafana, InfluxDB is a plus
Strong communication and interpersonal skills
Willing to take ownership of problems and see them through to resolution
A bachelor's degree in Computer Science or Engineering and some professional work experience or equivalent work experience.
About Comcast IPCDN:
IPCDN / Traffic Control is a quasi-startup division within Comcast's Technology and Product Division, spun out from IP Video and online projects that originated within Comcast Interactive Media, and is based in downtown Denver, CO. We are an open source based (Want to learn more? Visit https://trafficcontrol.apache.org) IP content delivery infrastructure that's been built to deliver a broad mix of on-demand video, live TV streams and an assortment of other digital media to an array of connected devices in the home.
About VIPER (Video IP Engineering & Research)
The VIPER team is an organization within Comcast Technology Solutions which supports the Product Management, Research & Development, Engineering and Operations for Comcast's World Class Video Experiences in use by Comcast and Syndicated Partners. We support both legacy QAM Video Delivery and the Next Generation IP Video Platform infrastructure. From Content Acquisition to the Player Consuming experiences, we control the end-to-end Video delivery platforms. These platforms deliver video for live linear, video on-demand and cloud DVR services consumed by more than 20 million customers both in-home and out of home on STBs, connected TVs, mobile and desktop products.
Comcast is an EOE/Veterans/Disabled/LGBT employer