Reducing 2AM headaches Part 3: Resiliency

The title of this series underscores our motivation for building a toolbox for system management: silencing the pager. In the first part of the series, we discussed the importance of standardization; in the second, automation. As we conclude the series, we turn our focus to resiliency.

Operations management aims to keep failures to a minimum while increasing efficiency. The systems we manage are complex chains of interconnected processes that sometimes fail. The average time a component runs between failures is known as the mean time between failures (MTBF). In a chained system, availability compounds: every component must be up for the system to be up, so each component's availability multiplies against the rest and drags down the availability of the overall system. Parallel systems, unlike chained systems, have higher availability, since the system counts as available so long as either of the two parallel members is up. Users want their MP3s to play, their photos to print, their updates to reach their friends and families. The “nines” express availability as an amount of allowable downtime over a time period: how long users will put up with no lights or dial tone. “Two nines” is 99%, or about 3.65 days of downtime a year; “five nines” is 99.999%, or about five and a quarter minutes a year.
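The availability math above is simple enough to sketch directly. This is a minimal illustration (the function names are mine, not from any library): chained availabilities multiply, parallel systems fail only when every member fails, and each "nine" maps to a yearly downtime budget.

```python
# Illustrative availability math for chained (serial) vs. parallel systems,
# and the downtime budget implied by each "nine".

MINUTES_PER_YEAR = 365.25 * 24 * 60

def serial_availability(components):
    """In a chain, every component must be up: availabilities multiply."""
    result = 1.0
    for a in components:
        result *= a
    return result

def parallel_availability(components):
    """In parallel, the system is down only if every member is down."""
    downtime = 1.0
    for a in components:
        downtime *= (1.0 - a)
    return 1.0 - downtime

def downtime_minutes_per_year(availability):
    return (1.0 - availability) * MINUTES_PER_YEAR

# Three 99% components in a chain drop the system below 99% overall...
print(serial_availability([0.99, 0.99, 0.99]))    # ~0.9703
# ...while two 99% members in parallel beat either one alone.
print(parallel_availability([0.99, 0.99]))        # 0.9999
# "Two nines" vs. "five nines" as yearly downtime:
print(downtime_minutes_per_year(0.99))            # ~5260 minutes (~3.65 days)
print(downtime_minutes_per_year(0.99999))         # ~5.26 minutes
```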

Regardless of the system reliability math, the “belt and suspenders” mentality that pervades operations is based on the driving theme of uptime. Just about everything we do is to mitigate, prevent, manage or discover failures or failure-inducing conditions. Standard operating environments reduce the variables that could cause issues in practice or troubleshooting.  Automation reduces human interactions that can introduce drift.  Monitoring alerts us to changes from the expected system behaviors.  Written methods of procedure ensure a clear understanding of actions taken during a maintenance window.  Backups provide recovery from catastrophic failures.  High availability clustering provides parallel environments to increase our reliability.

Recognize that failures are inevitable. The only thing we can control is our response.

Changing our vantage point

I'd like to offer up a challenge: where we've been architecting around reliability, we should be building for recovery. Mean time to recover (MTTR) measures how long it takes to restore service, which, quite frankly, is more important than eliminating potential failure points. Service Level Agreements (SLAs) and, perhaps more importantly, the corresponding penalties are based on downtime metrics, not uptime. This makes recovery time our keystone for designs, not reliability. A simple system with fewer components that fails once a month and takes five minutes to restore service has the same uptime 'score' as a highly complex fail-over architecture that accumulates ten 30-second interruptions over the same period.
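The equivalence is easy to check with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). A quick sketch (the scenario numbers are illustrative, all values in minutes): one five-minute outage a month and ten 30-second interruptions spend the same downtime budget.

```python
# Steady-state availability as a function of MTBF and MTTR (in minutes).

MONTH_MINUTES = 30 * 24 * 60  # 43,200 minutes in a 30-day month

def availability(mtbf, mttr):
    """Fraction of time in service: uptime / (uptime + repair time)."""
    return mtbf / (mtbf + mttr)

# Simple system: one 5-minute outage per month.
simple = availability(MONTH_MINUTES - 5, 5)
# Complex fail-over architecture: ten 30-second interruptions per month.
complex_ = availability((MONTH_MINUTES - 5) / 10, 0.5)

print(simple, complex_)  # both ~0.999884: an equal downtime budget
```

Same score, very different operational experience, which is exactly why the recovery model deserves the design attention.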

Taking the Conversation Off Road

Changing to a service resiliency model instead of a failure survivability model impacts the choices we make in architecture and our tooling.  Let's look outside the software arena and examine two prominent auto manufacturers who approach this idea in different ways: Jeep and Rolls Royce.

Designed for the US military as an all-terrain reconnaissance vehicle, the Jeep was built mainly from off-the-shelf automotive components. The design allowed for quick modification and repair: a modern Jeep can be disassembled and reassembled by an Army drill team in under four minutes. Showy, perhaps, but it gets to the core of the design. Given the use case, part failure is inevitable, so Jeeps need to be easy to recover.

On the other hand, the iconic British luxury car Rolls Royce is designed around long duty cycles in less harsh conditions. Specialized electronics, engine, and interior components are built to increase lifecycle reliability. While this can mean long and expensive repairs, Rolls Royce has earned a brand reputation for the highest-quality parts.

Software Engineering Tools

From the design table to the shop floor, it’s clear these two automotive icons differ in major ways. Tolerances on parts and assembly, material choices, and the methods and tools used during assembly all vary widely based on the guiding choice of recovery versus reliability. The analogy holds in software engineering as well: we care about different things if a single component failure in the chain doesn't bring our application to a grinding halt. These design choices also change as new technologies emerge. Virtualization, for instance, provides new options for recovery and resiliency, while cloud computing offers a similar but distinct set of options and challenges.

Many of the tools remain the same, but how we apply them to our design choices will change. Monitoring will alert us to problem components that need to be investigated and returned to the pool of available resources. An automated configuration management system can correct drift without human intervention. Load balancers can respond to increases in load elastically, drawing on pools of available resources.
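The recovery-first workflow those tools enable can be sketched as a simple control loop. This is a hypothetical illustration — the node model, the desired-state dictionary, and every function name here are mine, not a real tool's API: drifted or unhealthy nodes are remediated and returned to the pool rather than paged about.

```python
# Hypothetical sketch of a self-healing pool: correct configuration drift
# on each member and return remediated nodes to rotation, no human needed.

from dataclasses import dataclass, field

# Assumed desired state; a real system would pull this from its CM tooling.
DESIRED_CONFIG = {"ntp": "enabled", "agent_version": "2.4"}

@dataclass
class Node:
    name: str
    config: dict = field(default_factory=dict)
    healthy: bool = True

def reconcile(node):
    """Converge a node on the desired configuration; return the drift found."""
    drift = {k: v for k, v in DESIRED_CONFIG.items() if node.config.get(k) != v}
    node.config.update(drift)
    return drift

def heal_pool(pool):
    """Remediate every node and return the full pool to rotation."""
    for node in pool:
        drift = reconcile(node)
        if drift or not node.healthy:
            node.healthy = True  # assumption: remediation restores health
    return pool

pool = heal_pool([Node("web1", dict(DESIRED_CONFIG)),
                  Node("web2", {"ntp": "disabled"}, healthy=False)])
print([(n.name, n.healthy) for n in pool])  # both back in service
```

The design point isn't the code; it's that remediation and return-to-pool happen inside the loop, so a single member's failure never becomes a 2AM page.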

There are other factors, to be sure; inefficient processes for getting the right people involved probably waste the most time in an outage. But a recoverable system that lets techs work around and repair the failed component without impacting service availability will tack on that extra 9 much faster.

Sounds like the cloud, neh?