Data Center Transformation and IT Infrastructure Management

Data Center Transformation

As more organizations worldwide become knowledge warehouses, they need to handle critical data in a simpler, more flexible and more efficient manner without compromising security or risking data loss. Mindteck's expertise in this domain can help an organization enhance and optimize existing assets, lower operating costs, improve performance and meet service levels more efficiently.

How is Data Center Transformation helpful?

  • Transforms your existing IT infrastructure into shared, virtualized pools of resources under centralized control.
  • Reduces maintenance time and cost by automating the patching and support of data center services.
  • Improves service availability to the business, thereby increasing the productivity of resources.
  • Implements methods that enable the organization to meet current and future IT infrastructure management needs while taking account of environmental factors such as power savings and a reduced carbon footprint.

IT Infrastructure Management

The Mindteck Infrastructure team has an extensive portfolio of technical skills, with experience spanning multiple products, technologies, vendors and platforms. This depth of technical skill ensures that IT incidents are handled efficiently and effectively. With a well-structured escalation matrix that maps onto higher levels of skill, the Mindteck IT team ensures a high quality of service and very high availability of the IT infrastructure.

Local and remote IT support services: Mindteck's standard centralized support process flow includes call management and problem resolution activities. Mindteck has developed a Ticket Resolution process flow as part of its Incident Management System, which is based on the ITIL framework. This process draws on both industry best practices and our prior experience in implementing support delivery models very similar to the one being proposed for this engagement. In our approach, we emphasize customer satisfaction and have built customer feedback mechanisms into the resolution procedure. The objective is to provide L1, L2 and L3 support to resolve issues reported by users, including all the troubleshooting and analysis those issues require. The support includes end-user assistance with hardware, software, networks and any internal business applications.

Mindteck Capability

SLA Based Delivery

Mindteck offers SLA-based Infrastructure Management services. Our SLA-based offering removes uncertainty about service continuity and guarantees high uptime.

Mindteck has an in-depth understanding of IT infrastructure and mature ITIL process capability for handling this kind of support, built on experience in the design, deployment and management of enterprise infrastructure.

Support Process

Service Delivery Process

Service Delivery for Technical Application Management Services shall use the ITIL framework and shall be measured and reported against Service Level Targets in the following areas:

  • Incident Response & Resolution
  • Problem Management
  • Service Request Management
  • Application Availability
  • Technical Change Request Management
  • Release Management
  • Capacity Management

The Service Desk can be reached by logging a ticket in the helpdesk tool and, under special circumstances, by phone or email by select authorized users. The function of the Service Desk is call logging, administration and escalation. Tickets and requests shall be serviced according to their priority.

Tickets can only be logged by or for Authorised Callers, unless specific permission is granted by the Business Unit Manager. All interactions with Authorised Callers shall be registered in the helpdesk tool. Users shall be notified when a ticket is assigned, resolved and closed.

Incident Response & Resolution

An Incident is any event which is not part of the standard operation of the service and which causes, or may cause, an interruption or a reduction of the quality of the service. Incident Management takes the necessary measures to restore normal operations as quickly as possible with the least possible impact on either the business or the user. If the incident management measures do not address the issue structurally, the issue will be taken over by the problem management process.

The table below specifies the committed response and resolution times for the various incident priorities. There is no commitment for P5, since the incidents reported in this category do not constitute a degradation or loss of service.

The Response Time is measured as the interval between the time the incident is logged and the time work on it starts. An incident is considered to be worked on once its status changes to Assigned. Response Time is measured for tickets that are opened in the defined review period.

Priority | Description | Response Time
P1 | Critical: service unavailable, device error or other major fault that makes the service inaccessible to users | 15 minutes
P2 | High: business functionality down, cannot perform necessary tasks, possible temporary workaround | 30 minutes
P3 | Medium: performance impacted, possible workarounds | 1 hour
P4 | Low: inconvenience | 4 hours
P5 | Very Low | 1 day
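
As a rough illustration, the response targets above could be checked along the following lines. This is a minimal Python sketch, not a description of any particular helpdesk tool: the function name and the `logged_at` / `assigned_at` parameters are assumptions introduced for the example. P5 is left out of the target table because the text states that it carries no commitment.

```python
from datetime import datetime, timedelta

# Committed response targets taken from the table above.
# P5 is omitted because no commitment applies to it.
RESPONSE_TARGETS = {
    "P1": timedelta(minutes=15),
    "P2": timedelta(minutes=30),
    "P3": timedelta(hours=1),
    "P4": timedelta(hours=4),
}


def response_met(priority, logged_at, assigned_at):
    """Check one incident against its response target.

    Response time is the interval between the incident being logged and
    its status changing to Assigned. Returns True or False for committed
    priorities and None when no target applies (for example P5).
    """
    target = RESPONSE_TARGETS.get(priority)
    if target is None:
        return None
    return (assigned_at - logged_at) <= target
```

A P1 incident logged at 09:00 and assigned at 09:10 would meet its target under this sketch, while one assigned at 09:16 would not.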

The Resolution Time is defined as the total elapsed work-in-progress time from call open to service restoration. Tickets with the status "Pending" are excluded from this SLA. Resolution Time shall be suspended (the ticket kept in Pending) if a disaster is declared, a database restore is declared, or Mindteck is unable to progress the investigation while awaiting information or feedback from another vendor or relevant third parties. Resolution Time compliance is measured as the percentage of closed incidents that were resolved within target during the defined review period.

When a workaround is provided for a higher-severity issue, the incident will be downgraded in the helpdesk tool. Once the severity has been downgraded, SLA compliance will be measured against the norms for the downgraded severity.

Priority | Resolution Time
P1 | 2 hours
P2 | 4 hours
P3 | 8 hours
P4 | 3 working days
P5 | 5 working days

NOTE: The Response and Resolution times defined here are generic and can vary according to customer requirements.
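
The resolution measurement described above can be sketched as follows. This Python example is illustrative only: the status names, the shape of the status history and both function names are assumptions. It sums the time between status changes while skipping intervals spent in Pending, then reports the percentage of closed incidents resolved within target. For simplicity it uses calendar time, so the working-day targets for P4 and P5 would need a business-calendar adjustment that is not shown here.

```python
from datetime import datetime, timedelta


def work_in_progress_time(history):
    """Total elapsed time from open to resolution, excluding Pending.

    `history` is a chronologically ordered list of (timestamp, status)
    tuples that starts when the ticket is opened and ends at resolution.
    Each interval is attributed to the status that was active at its start.
    """
    total = timedelta()
    for (start, status), (end, _next_status) in zip(history, history[1:]):
        if status != "Pending":
            total += end - start
    return total


def compliance_percent(resolution_times, target):
    """Percentage of closed incidents resolved within the target.

    Returns None when there are no closed incidents in the review period.
    """
    if not resolution_times:
        return None
    within = sum(1 for elapsed in resolution_times if elapsed <= target)
    return 100.0 * within / len(resolution_times)
```

For a ticket that spends two hours in Pending while waiting for a third party, those two hours do not count towards its resolution time.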

Guideline for selecting Priority in Incident Management

The Priority-Impact matrix below assists in selecting the right priority for a given Impact / Urgency combination. The matrix is to be used by support staff as a guide to consistently assign the correct priority level. It serves as an aid and does not override their knowledge and experience of the business. If they assess the priority to be higher than the matrix suggests, they may change it accordingly.

Urgency \ Impact / Complexity | Critical | High | Medium | Low
Critical | P1 | P2 | P3 | P4
High | P2 | P3 | P4 | P5
Medium | P3 | P4 | P5 | P5
Low | P4 | P4 | P5 | P5
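
A lookup over the matrix above might look like the sketch below. The orientation is an assumption: each group of four priorities in the source is taken to belong to one Impact column, listed from Urgency Critical down to Urgency Low. The names `PRIORITY_BY_IMPACT` and `suggest_priority` are hypothetical. The result is only a suggestion, since support staff may raise the priority based on their own assessment.

```python
LEVELS = ("Critical", "High", "Medium", "Low")

# One tuple per Impact column; entries run from Urgency Critical to Urgency Low.
# Assumption: the flattened matrix values are read column by column.
PRIORITY_BY_IMPACT = {
    "Critical": ("P1", "P2", "P3", "P4"),
    "High": ("P2", "P3", "P4", "P4"),
    "Medium": ("P3", "P4", "P5", "P5"),
    "Low": ("P4", "P5", "P5", "P5"),
}


def suggest_priority(urgency, impact):
    """Return the suggested priority for an Urgency / Impact combination."""
    return PRIORITY_BY_IMPACT[impact][LEVELS.index(urgency)]
```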

Problem Management

Problem Management will be used to fix frequently recurring incidents by studying incident trends and carrying out a Root Cause Analysis (RCA). The findings typically lead to a solution implemented through the change request management process, which may involve upgrades or updates to the application and/or permanent fixes.
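
One simple way to study incident trends is to count closed incidents per category and flag the categories that recur. The Python sketch below assumes each incident record carries a `category` field and uses an arbitrary threshold of three occurrences. Both are illustrative choices, not part of the process described here.

```python
from collections import Counter


def recurring_incident_categories(incidents, threshold=3):
    """Return categories seen at least `threshold` times, most frequent first.

    `incidents` is a list of dicts with a "category" key (an assumed shape).
    Flagged categories are candidates for a Root Cause Analysis.
    """
    counts = Counter(incident["category"] for incident in incidents)
    return [cat for cat, n in counts.most_common() if n >= threshold]
```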

Service Requests

Service Request management is used to provide a channel for users to request and receive standard services for which a predefined approval and qualification process exists.

Response Time: Measured as the interval between the time the request is logged in the helpdesk tool and the time work on it starts. A request is considered to be worked on once its status changes to Assigned. Response Time is measured for requests that are assigned in the defined review period.

Resolution Time: Defined as the total elapsed work-in-progress time from service request open to resolution. Service requests with the status "Pending" are excluded from this SLA. Resolution Time shall be suspended (the request kept in Pending) if a disaster is declared, a database restore is declared, or the support team is unable to progress the investigation while awaiting information or feedback from the client or relevant third parties. Resolution Time compliance is measured as the percentage of closed service requests that were resolved within target during the defined review period.

Response Time: 100% within 1 day
Resolution Time: 100% within 3 days

Application Availability

Application availability management is concerned with the ability of a system to respond predictably to requests, which are mainly directed at the SharePoint portal. The availability of a service relies on the health of a number of components, including the network and the server computers that host the service. Server components such as network cards, power supplies and hard disk drives can all affect the availability of the service.
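
To make the dependency on components concrete, a common simplification is to treat the service as a chain in which every component must be up and failures are independent. Under that assumption the service availability is the product of the component availabilities. The Python sketch below uses hypothetical figures and is not a measurement method prescribed by this document.

```python
from math import prod


def service_availability(component_availabilities):
    """Availability of a service whose components must all be up.

    Assumes independent failures and a purely serial dependency chain.
    """
    return prod(component_availabilities)


def monthly_downtime_minutes(availability, days=30):
    """Expected downtime in minutes over a month of `days` days."""
    return (1.0 - availability) * days * 24 * 60
```

With a network and a server that are each available 99.9% of the time, the combined figure falls slightly below 99.9%, which is why redundant components are often used for critical services.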

Change Management

Change Management is a process that provides a standardized and structured way to exchange information on technical changes and to request out-of-scope services. A Request for Change (RFC) can be triggered by a portal user or by Proxama Directorates.

The Change Request may be initiated due to any of the following conditions:

  • Execution of a newly identified task
  • Changes in identified tasks
  • Any changes in the targeted environment
  • Any delay in availability of deliverables/approval/review by Management
  • Upgrades/Updates to the portal application

The identified change has to be documented and communicated between the Mindteck project manager and the client's key contact person in written form, such as an e-mail or a suitable template. Verbally communicated changes are expected to be followed by a written request.

Change Management Process

Change Evaluation & Approval - The change will be evaluated for its impact on the project schedule/cost estimates and communicated to portal administrators.

Change Control Board (CCB) - A CCB will be set up to manage changes to the configuration items. All major change requests will be directed to the CCB, which evaluates their impact on the deliverables. Based on the impact analysis, the CCB will decide on revisions to the schedule and effort estimates.

The Change Control Board will:

  • Have authority to evaluate and direct change within the scope of the requirement.
  • Make decisions based on the advice of the development team and the Quality Assurance team.
  • Evaluate the change
    • Based on the impact analysis, the CCB will decide on the revisions to the schedule and effort estimates.
  • Approve the change
    • In principle, all changes have to be evaluated by the CCB before they are implemented.
    • Only approved change requests are taken up for implementation.
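
The rule that only evaluated and approved changes reach implementation can be expressed as a small state machine. The Python sketch below is one possible model. The status names and the `advance` function are assumptions made for the example, not the workflow of any specific tool.

```python
from enum import Enum


class ChangeStatus(Enum):
    RAISED = "raised"
    EVALUATED = "evaluated"
    APPROVED = "approved"
    REJECTED = "rejected"
    IMPLEMENTED = "implemented"


# Allowed transitions: a change must be evaluated by the CCB and then
# approved before it can be implemented.
ALLOWED = {
    ChangeStatus.RAISED: {ChangeStatus.EVALUATED},
    ChangeStatus.EVALUATED: {ChangeStatus.APPROVED, ChangeStatus.REJECTED},
    ChangeStatus.APPROVED: {ChangeStatus.IMPLEMENTED},
    ChangeStatus.REJECTED: set(),
    ChangeStatus.IMPLEMENTED: set(),
}


def advance(current, new):
    """Move a change request to `new`, rejecting transitions that skip a step."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move change from {current.value} to {new.value}")
    return new
```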

Emergency support for change requests and feature enhancements will be taken up based on the influx of change requests. If required, additional resources will be allocated for change request handling.

Release Management

The main tasks that make up Release Management are broadly classified into:

  • Release planning
  • Release building and configuring
  • Release Testing and acceptance
  • Release Approval
  • Rollout planning and Scheduling
  • Post Implementation reviews
  • Sign off of Release

Release Building and Configuring

  • All components of the release shall be identified, updated with the release unit and mapped to the Request for Change.
  • The dependencies between the components in a release unit, as provided by the development team, shall be understood.
  • The validated components shall be packaged and deployed in the QA environment and basic sanity testing shall be carried out.
  • Rollback shall be tested in the QA environment.

Implementation

  • Deploy the release units as per the Roll out plan
  • Roll back a failed release
  • Carry out Root Cause Analysis
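
The implementation steps above can be sketched as a loop that deploys release units in rollout order and undoes the completed ones if a unit fails. In this Python example the `deploy` and `rollback` callables are hypothetical placeholders for whatever deployment mechanism is in use. The failed unit is returned so that it can feed the Root Cause Analysis.

```python
def deploy_release(units, deploy, rollback):
    """Deploy release units in rollout-plan order.

    If a unit fails, the units already deployed are rolled back in reverse
    order. Returns (True, None) on success or (False, failed_unit) on failure.
    `deploy` and `rollback` are caller-supplied callables taking one unit.
    """
    deployed = []
    for unit in units:
        try:
            deploy(unit)
        except Exception:
            for done in reversed(deployed):
                rollback(done)
            return False, unit
        deployed.append(unit)
    return True, None
```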

Post Implementation Reviews

  • Carry out the Root Cause Analysis for failures.
  • Initiate an action plan for re-deployment.
  • Trigger a change request, if required.
  • Update the release record with the key learnings from the release.

Capacity Management

Capacity management is an ongoing process, because no implementation remains static in terms of content and usage. The Capacity Management model includes additional steps to help us to validate and tune the initial architecture, and provides a feedback loop for re-planning and optimizing the production environment until it can support design goals with optimal choices of hardware, topology, and configuration.

Capacity Management will consider the below factors:

  • Requests Per Second (RPS): The number of requests received by a portal or server in one second. This is a common measurement of server and farm load. Note that requests are different from page loads; each page contains several components, each of which creates one or more requests when the page is loaded. Therefore, one page load creates several requests. Typically, authentication handshakes and events consuming negligible resources are not counted in RPS measurements.
  • Peak hours: The time or times during the day when load on the Server is at its maximum.
  • Peak load: The average maximum daily load on the Server, measured in RPS.
  • Load spike: Transient load peaks that fall outside normal peak hours. These may be caused by unplanned increases in user traffic, decreased farm throughput due to administrative operations, or combinations of such factors.
  • Model: The process of deciding the key solutions that the production environment is intended to support and establishing all important metrics and parameters. Modelling involves:
    • Understanding the expected workload and dataset
    • Setting farm performance and reliability targets
    • Analyzing server logs
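
As an illustration of how the factors above could be derived from server logs, the following Python sketch computes the peak hour and the peak load from a list of request timestamps. It assumes the timestamps have already been parsed from the logs and filtered to exclude authentication handshakes and other negligible events. The function name and return shape are hypothetical.

```python
from collections import Counter
from datetime import datetime


def capacity_metrics(timestamps):
    """Derive the peak hour and peak load from request timestamps.

    Peak load is the average, across days, of each day's maximum
    requests per second (RPS). Peak hour is the hour of day that
    received the most requests overall.
    """
    if not timestamps:
        raise ValueError("no requests to analyse")

    # Requests per one-second bucket.
    per_second = Counter(ts.replace(microsecond=0) for ts in timestamps)

    # Highest RPS observed on each calendar day.
    daily_max = {}
    for second, count in per_second.items():
        day = second.date()
        daily_max[day] = max(daily_max.get(day, 0), count)

    per_hour = Counter(ts.hour for ts in timestamps)
    return {
        "peak_hour": per_hour.most_common(1)[0][0],
        "peak_load_rps": sum(daily_max.values()) / len(daily_max),
    }
```

Load spikes could be found in a similar way by looking for one-second buckets whose counts are high but fall outside the peak hour.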

Delivery Models

  • Offshore
  • Offsite
  • Onsite
  • Near-Shore
  • BOT