The Ultimate Guide To Oracle Support
The intention of this post is simple: Oracle Support is a vast global organisation governed by processes. If you can understand those processes – and so understand the way Oracle Support works – you can maximise your return on the time you invest using their services. I originally wanted to give this post the much snappier title Know Thy Enemy but I feel that would be unfair; I honestly don’t want to paint Support in a negative light. If you want to know why I feel that way then make sure you read the Experiences With Oracle Support section at the end.
Anyone who has ever worked in a technical capacity with Oracle products has more than likely had to deal with Oracle Support at some point. This may be in the simplest form of using the customer support portal (fondly remembered as Metalink but now known as My Oracle Support or MOS), or it may be in the more involved manner of interacting with Support analysts to investigate and (hopefully) resolve a problem. For the latter, an Oracle Premier Support contract is generally required – and it is at customers with this level of support that this post is aimed.
As you will doubtless have read on other parts of this site, I used to work for Oracle Advanced Customer Services (ACS), which is a global business unit within Oracle’s Customer Services division. At the time of my employment, other business units in the same division included my friends in Oracle University (for which I have the highest respect, particularly Joel and Harald) and the ladies and gentlemen of Oracle Global Customer Services – known internally as GCS, but more popularly known externally simply as Oracle Support. I spent a lot of time collaborating with GCS and so got to know both the people and the systems. The conclusion I came to is that by knowing the (publicly available) workings of this vast and mighty machine, you can increase your chances of getting the best experience when you use it.
Understanding Oracle Support
GCS is one of the largest global support organisations in the world, with thousands of employees located in almost every country you can think of. They deal with an incredible number of customers and log an insane number of service requests (SRs) every day. With that sort of organisation it is impossible to function without processes – lots of them. So with Oracle being the sort of company that likes to eat its own dogfood, back in 2009 the disparate legacy support systems from Oracle, Sun, Siebel, Peoplesoft, BEA and a host of other acquisitions were merged into one single system running Siebel 8.0 and using Siebel Call Centre. This massive project was known as Project Orion – and if it sounds complex you haven’t even begun to understand the scope of it. Orion has a number of front ends, including the Internal Support Portal (ISP, only accessible by Oracle employees) and the infamous My Oracle Support (MOS) customer portal, as well as numerous connections to other systems such as SURe (the Oracle Knowledge Management database) and the BUG database. I say infamous because some bright spark decided that MOS should use Adobe Flash – culminating in a horrendous user experience which, if you were lucky enough to get past the 90% loading bar, gave you all sorts of headaches. Thankfully, ISP was implemented using ADF, which has now also been used to replace Flash on MOS.
If you are entering the world of Oracle Support with the intention of resolving a problem or answering a question, Orion is the machine you must operate in order to reach your goal. Of course, it is not always necessary to understand a machine in order to operate it (otherwise this blog wouldn’t exist), but you have to admit that if things do not go as you expect it would be handy to know what is happening under the covers. I mentioned the word processes earlier and now I’m mentioning it again in bold. It’s important. Understand the processes and you will realise why things happen, why they sometimes don’t happen as expected and how to go about ensuring you get what you want. I’m not suggesting you break any rules though – and I’m not giving away any information that isn’t publicly available (take note, Oracle lawyers – every single fact here is repeated elsewhere on the Internet or in MOS notes such as 166650.1 and 199389.1).
One more thing: please don’t forget that the Oracle Support analysts are simply cogs in this machine, bound by the same processes as you. No matter what frustrations you feel in the heat of the moment, don’t forget they are people too – and as such, any limited room to manoeuvre that they have will be affected by the way you treat them.
Roles and Responsibilities
GCS consists of a number of groups of people based broadly in four different timezones: EMEA, US (East), US (West) and AUS, and JAPAC. The JAPAC zone includes the Indian Service Centre, a massive group of employees based in India – probability dictates that when you are working in this timezone you will likely be speaking to the friendly people of the ISC.
The groups can be split up into three main categories: the HUB (the people who answer the telephone when you call the support centre i.e. the ones that drink from the firehose), the GRID (the technical analysts who own and work on service requests) and Support Management, the various managers who run the organisation and get involved in escalations etc.
Some roles to know: the analyst from the GRID who works your service request is known as the Owning Analyst. They are part of a cluster, i.e. a group of analysts with related skills such as Performance, Backup and Recovery, or High Availability – and they are specialists in this particular field. If you raise an issue which becomes owned by the HA team but it turns out to be a performance problem, they will need to transfer it to the Performance team – which means someone else becoming the owner. The clusters all belong to a high level product set such as Database or Middleware – and each of these has one Duty Manager on shift at any time of the day or night. If you really really had a massive problem and you needed to alert Support Management about it, this is the person you would most likely want to speak to (although as you can imagine, sometimes they are quite busy).
There is one other role to know: the Cluster Technical Coordinator or CTC. The CTC is an analyst but they do not own or work on issues during their CTC shift; instead they go around looking at other issues and, if necessary, taking action. An example would be that if you updated your service request to ask for it to be upgraded to a severity one “critical” issue, but your owning analyst was off-shift, the CTC would be the one to pick up the update and action it (although in practice you would be far better off calling the HUB if you wanted to do that as you are guaranteed an immediate response).
Note that I am only discussing GCS here. There are other teams who might get involved in working on an issue you raise, but you will never directly communicate with them. Your GCS analyst will always be the buffer between you, BDE, Sustaining Engineering and Base Development. For more information about these teams, read my guide All About Patching.
It’s also important to understand the implications of timezones, so we’ll come back to that after we’ve described service requests in more detail.
At the heart of Orion is the concept of a service request or SR. You probably all think you know what an SR is – it is a single issue that you raise with Oracle Support as a request for them to respond to. But for you database people, try picturing it as a set of parent-child entity relationships. Every customer has one or more CSIs (Customer Service Identifiers), and every SR is raised against one of these CSIs. The CSI must be active for this to happen, which means you must have an active support contract.
The SR itself is also a parent-child relationship between the SR Header and the SR Details (or “activities”). The parent is SR Header, which has attributes such as:
- SR number (a unique key identifying your single SR)
- Problem type, summary and description
- Customer details (CSI, contact details, customer’s timezone, language, etc)
- System details (Product version, component, platform, OS version, etc)
- Global Customer Support info (owning analyst, escalation owner, etc)
- SR Properties (status, substatus, severity, escalation status, 24×7 flag, etc)
This last set of properties is particularly important to understand, so we’ll delve into it in a minute. But first let’s cover the SR Details section. This is a set of activities which take place during the lifetime of the SR, some of which you can see but many of which are viewable only by Oracle employees:
- Updates from customers
- Updates from GCS analysts, managers or other Oracle employees
- Unpublished updates from Oracle employees (invisible to customers)
- Escalation requests
- Transfers between GCS analysts
- Updates from relevant defects (bugs)
- File uploads
- Outbound emails to customers
- SR Closure details
That’s a lot of stuff. If you consider that as a customer you are only seeing a fraction of the total information in an SR, you should realise that any SR which has been open for a while is going to contain many pages of information – not all of it relevant. That’s worth considering when your SR gets transferred to a new analyst who needs to read the content before getting back to you with feedback. It’s also the reason why you should think hard about what you post in the SR – pasting pages of screen output is unlikely to assist anyone, for example. In a well-worked SR the customer uploads debug information in files and the analyst then pastes any relevant snippets in as text, with comments, in case someone else has to take ownership later on.
It’s also the reason why you can never do any harm by restating your objectives for the SR every so often. You might think that as the customer you shouldn’t have to do this, but at the end of the day do you want to maximise your chances of success or not?
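For the database people who like to see the entities written down, here is a minimal sketch of the parent-child structure described above: a CSI owns SRs, and each SR header owns a list of activities, only some of which the customer can see. Every class, field and method name here is my own invention for illustration (I am not reproducing the Orion schema), and the rule that only unpublished updates are hidden is a simplifying assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class ActivityType(Enum):
    CUSTOMER_UPDATE = "customer update"
    ANALYST_UPDATE = "analyst update"
    UNPUBLISHED_UPDATE = "unpublished update"
    ESCALATION_REQUEST = "escalation request"
    TRANSFER = "transfer"
    FILE_UPLOAD = "file upload"


# Assumption for this sketch: only unpublished updates are hidden from customers.
INTERNAL_ONLY = {ActivityType.UNPUBLISHED_UPDATE}


@dataclass
class Activity:
    """One child row of an SR (the "SR Details")."""
    kind: ActivityType
    created: datetime
    text: str


@dataclass
class ServiceRequest:
    """The SR header, parent of its activities."""
    sr_number: str
    csi: str
    summary: str
    severity: int                 # 1 (critical) to 4 (minimal)
    status: str = "Open"
    follow_the_sun: bool = False  # the 24x7 flag
    activities: list[Activity] = field(default_factory=list)

    def customer_view(self) -> list[Activity]:
        """Return only the activities a customer would see."""
        return [a for a in self.activities if a.kind not in INTERNAL_ONLY]


@dataclass
class SupportIdentifier:
    """A CSI, parent of the SRs raised against it."""
    csi: str
    active: bool
    service_requests: list[ServiceRequest] = field(default_factory=list)

    def raise_sr(self, sr_number: str, summary: str, severity: int) -> ServiceRequest:
        # An SR can only be raised against an active CSI.
        if not self.active:
            raise ValueError(f"CSI {self.csi} is not active")
        sr = ServiceRequest(sr_number, self.csi, summary, severity)
        self.service_requests.append(sr)
        return sr
```

The useful thing to notice is that `customer_view()` returns a subset of `activities`: the analyst inheriting your SR has to read the full list, which is always at least as long as the one you see.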
Unlike its predecessor Metalink, Orion has two status fields: status and substatus. The allowable values of substatus are dependent on the value of status, which can simply be Open or Closed. Here are the substatus values as I remember them, although of course they are subject to change:
I’m not going to describe them all – hopefully you can work out what they mean, otherwise you can read MOS note 986123.1 (which I’ve just realised shows a couple of extra substatus types to those I listed above – I told you they were subject to change). What is important to understand is that the substatus types in blue are the ones where the ball is in your court, i.e. GCS are waiting on the customer to respond.
Severity is a simple field containing a number from one to four:
- Severity 1 – Critical
- Severity 2 – Significant
- Severity 3 – Standard
- Severity 4 – Minimal
So why is this the single most misunderstood and abused property in any SR? The point of the severity attribute is to explain to Oracle the business impact that this problem is having within your organisation. It doesn’t mean the speed with which you expect a response, although of course there is a relationship between the two. What it specifically doesn’t mean is you haven’t had a response today so you want the severity increased – yet that is a common request.
Perhaps the problem comes from the fact that users do not directly choose the severity, but instead are asked a series of questions during the SR creation process which determines the level of severity assigned. It seems that not everybody understands the impact of these questions (e.g. “Is the loss of service minor?”), resulting in Sev 1 SRs which are actually minor issues and Sev 4 SRs where the user expects a reply within the hour.
Getting the severity right during SR creation matters – and I’ll explain why when I get to the section on SR Assignment.
24×7 Flag – Follow The Sun
If the SR has the 24×7 flag set then it is automatically in follow the sun mode, which means that at the end of every shift it will transfer to the next available timezone where a new analyst will pick it up and begin working on it. The sheer scale of Oracle’s support organisation makes this possible, which is a pretty amazing feat. However, it is not without its drawbacks, because there is a lot of work involved in handing over a hot issue from one analyst to another, particularly if it’s the sort of gnarly complex issue that runs and runs. If the SR is a complex one it may take the incoming analyst hours to work through it before they are able to continue the investigation. This is where your behaviour and ability to summarise the issue have a massive impact. If you can be patient and bring clarity to your communication, the analyst is going to be up to speed far quicker than if you get angry and shout “I ALREADY TOLD YOU THIS IN THE SR!”. I know that’s an incredibly patronising point to make, but you’d be surprised how emotional people get after a weekend with no sleep and conference calls every 4 hours.
Of course, if you expect Oracle to commit to working 24×7 on an issue they are probably going to want you to do the same. There’s no point in someone the other side of the globe asking questions if there is nobody available in your organisation to answer them until your working day begins. With that in mind, requests for Sev 1 are usually met with a request for your own 24×7 commitment and details of your technical and management contacts.
Note that 24×7 and Sev 1 are not synonymous. This is important to understand, because technically it is possible to have a Severity 1 SR which is not worked as follow the sun, or indeed a Severity 2 SR which is set to 24×7. These are exceptions and can only be agreed with Oracle Support management, but there are times when they happen. Note also that Oracle attempts to retain the same “service team” working a 24×7 SR, so if you had Joe Bloggs working your case in the US timezone yesterday it will likely be Joe who works it again today, providing he is on-shift. This isn’t guaranteed though, because Joe still needs to take the odd day off here and there, regardless of how big your problem is.
Sometimes there will be a need to bring your SR to the attention of Support management. The process for this is called escalation and it is described both in note 199389.1 and here in detail. One potential reason for escalating an SR is that you are dissatisfied with its progress, although there are other reasons e.g. if you have a drop-dead date and you want to ensure Support management understand the business impact. To escalate an SR you can either call the support centre or update the SR with your request. However, if you follow the latter course and your SR is not being actively worked (e.g. because the owning analyst is off-shift and it is not a 24×7 SR) there is a chance your update will be missed until the next shift begins. Once escalated your SR has a new property: the escalation owner. This is someone specific in GCS management who works with you and the analyst to develop an action plan for resolving your issue, so you might want to make sure you know their name. Note that an SR can be de-escalated too – and to be friendly you should request de-escalation if for example your critical deadline is lifted. It’s also worth being aware that you aren’t the only person that can request escalation of your SR – someone in Oracle may also choose to do so. One final thing about escalations: don’t get confused between escalations and severity increases.
SR Routing and Repatriation
When you raise a new service request it is assigned a score based on the various questions you answered. That score determines the urgency with which the SR will be treated – the lower the value, the more important it is. There are many factors which impact the score but by far the most important one is the severity. This is worth understanding, because if you create an SR as a Sev 4 and then immediately update it to say that you want it to be a Sev 2, you have just confused the system. What is likely to happen is that it will be sitting at the bottom of some analyst’s list of SRs (because it is a Sev 4) and they won’t see your update for some time. There is a solution to this: answer the questions properly in the first place!
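To make the point about scoring concrete, here is a toy model. I don’t know (and wouldn’t publish) the real Orion formula, so the weighting, the `routing_score` function and the SR names below are all hypothetical. The only things taken from the description above are that a lower score means more urgent, that severity dominates everything else, and that the score reflects the severity the SR actually has rather than the one you asked for in a later update.

```python
# Hypothetical weight: large enough that severity always outranks other factors.
SEVERITY_WEIGHT = 1000


def routing_score(severity: int, other_factors: int = 0) -> int:
    """Toy urgency score: the lower the value, the sooner the SR is looked at.

    `other_factors` stands in for everything else the creation questions feed
    into the score; it is kept below SEVERITY_WEIGHT so severity dominates.
    """
    if severity not in (1, 2, 3, 4):
        raise ValueError("severity must be 1, 2, 3 or 4")
    if not 0 <= other_factors < SEVERITY_WEIGHT:
        raise ValueError("other_factors must be between 0 and 999")
    return severity * SEVERITY_WEIGHT + other_factors


def work_order(queue: list[dict]) -> list[dict]:
    """Return the SRs in the order an analyst would reach them."""
    return sorted(queue, key=lambda sr: sr["score"])


queue = [
    {"sr": "SR-A", "score": routing_score(2, 500)},
    # Created as Sev 4; the customer then asked for Sev 2 in a text update.
    # The update does not change the score, so the SR stays at the bottom.
    {"sr": "SR-B", "score": routing_score(4, 0)},
    {"sr": "SR-C", "score": routing_score(3, 10)},
]

for sr in work_order(queue):
    print(sr["sr"], sr["score"])
```

In this model SR-B sits behind both of the others even though its owner thinks of it as a Sev 2, which is exactly the trap described above.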
It’s also important to understand the timezone of your SR. Here’s why:
10:00 Customer Jimbob’s Oracle RAC database takes a dive but comes back up again
10:01 Jimbob’s management team are on his case to find out why this happened
10:02 Jimbob begins gathering the various traces and diagnostics required to get an RCA
16:30 Tired and hungry, Jimbob finally finishes gathering all the data and logs a Sev 2 SR
16:35 Since the EMEA shift just finished the SR is routed to Mary in the US
16:55 Mary starts downloading the many and various log and trace files
17:30 Jimbob decides to go to the pub for a well-earned beer followed by going home to bed
20:15 Mary realises that the alert log is missing and so updates the SR to request it
20:16 The SR is now in a status of Waiting on Customer
But Jimbob is either drunk or unconscious so nothing is going to happen until the following morning.
09:30 After five cups of coffee, Jimbob reads the update and realises that the alert log was indeed missing
09:45 Jimbob uploads the alert log – SR is in status Review Update
Well guess what? Mary’s shift doesn’t begin for another 8 hours. That SR is going to sit in the same status until Mary comes on shift and continues working on it…
17:20 Mary puts the SR in status Work In Progress and starts looking at the alert log and other files
19:35 Mary asks Jimbob if he can supply OS Watcher details for the time around the incident
19:36 SR is back in Waiting on Customer
More time passes.
09:10 Jimbob begins uploading OSW details.
11:00 Jimbob’s management demand to know why after two whole days no progress has been made on the RCA
Are you getting the idea here? That timezone difference is going to make the whole thing very prolonged – and it’s not GCS’s fault. There is a solution though. As the SR owner, if your SR is owned in a different timezone to you, you can request to have your SR repatriated. This means that it will be sent back into the furnace of SRs inside Orion and routed to a new owner in your timezone. This often makes things a lot easier. Don’t forget, though, that if you ask for repatriation at the end of your working day you probably won’t get a response until the following morning when your local timezone’s shift begins. Also keep in mind that if you ask for repatriation in the SR but your owning analyst isn’t on-shift they won’t see it until they start work, possibly causing you to miss an entire day. For this reason it is better to request repatriation by calling the HUB.
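If you want to put rough numbers on Jimbob’s problem, the following sketch compares an analyst whose shift barely overlaps with the customer’s working day against one in the same timezone. The shift hours, the one hour of “think time” per update and the function names are all assumptions I have made up for the illustration; real shift patterns will differ.

```python
import math


def next_on_shift(t: float, start: float, end: float) -> float:
    """Earliest time >= t that falls inside a daily shift [start, end).

    Times are hours since midnight of day 0 in the customer's local time.
    `end` may exceed 24 for a shift that runs past midnight.
    """
    day = math.floor(t / 24)
    for d in (day - 1, day, day + 1):
        shift_start, shift_end = 24 * d + start, 24 * d + end
        if t < shift_end:
            return max(t, shift_start)
    raise AssertionError("unreachable for shifts shorter than 24 hours")


def elapsed_hours(logged_at, round_trips, customer, analyst, think_time=1.0):
    """Hours from logging the SR until the customer's last reply.

    One round trip = the analyst picks the SR up when next on shift, works
    for `think_time` hours and asks a question; the customer then does the
    same when next at work.
    """
    t = logged_at
    for _ in range(round_trips):
        t = next_on_shift(t, *analyst) + think_time
        t = next_on_shift(t, *customer) + think_time
    return t - logged_at


CUSTOMER = (9.0, 17.5)          # 09:00 to 17:30 local time (assumed)
REMOTE_ANALYST = (16.5, 24.5)   # 16:30 to 00:30 in customer-local time (assumed)
LOCAL_ANALYST = (9.0, 17.5)     # same hours as the customer (assumed)

remote = elapsed_hours(16.5, 2, CUSTOMER, REMOTE_ANALYST)
local = elapsed_hours(16.5, 2, CUSTOMER, LOCAL_ANALYST)
print(f"remote owner: {remote} hours, repatriated owner: {local} hours")
```

With these made-up hours, two rounds of question and answer take roughly twice as long when the owner is in the other timezone, because every update lands just as the other party has gone home.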
Tips for Working Effectively with Oracle Support
- Read the note Working Effectively With Oracle Support (166650.1)
- Describe your problem simply and then every so often re-state it
- Ensure you include the business impact and details of any deadlines or drop-dead dates
- Don’t escalate unless necessary and de-escalate when possible
- Get the questions right during the SR Creation phase
- Make sure you know what timezone your SR is being worked in – repatriate if necessary
- If you need an immediate response e.g. escalation or severity increase, call rather than update
- Make every effort you can not to overload the text of the SR with clutter
- Be nice!
Experiences With Oracle Support
Believe it or not I have known people have a frustrating experience or two with Oracle Support. You may have had a service request that ran and ran without coming to a conclusion, you may have been asked to upload lots of information (RDA anyone?) only for the analyst to then ask you what version of Oracle you are running, or you may have sat up all night detailing your issues on a conference call only for the analyst to finish their shift and the new incoming analyst to ask you if you could explain your problem again from the beginning. I have known people who felt that they were being asked for additional information purely to keep the status of their problem as “Waiting on Customer”. I have known people whose problems were pinged backwards and forwards between different support teams without anyone appearing to take ownership. And I have even known people who say that their service requests were closed before they felt it was appropriate – and then subsequent attempts to reopen the request went unheeded.
On a personal note I have spent a lot of time interacting with GCS, originally as a customer but then for a number of years as an Advanced Services Engineer for Oracle ACS. There are a number of differences between ACS and GCS, but one of the most obvious is that ACS engineers tend to be assigned to specific customers and based on those customers’ sites, while GCS analysts are generally office or home-based and work shift patterns dealing with whatever problems come their way, regardless of the customer. At the end of a shift, the GCS engineer hands their active service requests over to the next timezone and then logs off. If you are an ACS engineer on a customer site and all hell breaks loose, you are unlikely to be going home any time soon, particularly if you are friendly on a personal level with the customer – which is pretty likely if you work on their site full time. So the outcome is that the ACS engineer is often stuck at the sharp end of the problem, taking the full heat from their customer (where they work full time and any fallout will have consequences) whilst the GCS engineer is at the other end of the telephone and dealing with potentially many other problems at the same time. You might think that the ACS engineer is screwed. But it turns out that the boys and girls of GCS are lifesavers, because I have lost count of the number of times I have had my backside saved by a GCS analyst – often one who should already have gone home but stayed late to help a colleague (and this isn’t just restricted to GCS either, the same applies to the folks in BDE and Sustaining Engineering). I have known GCS analysts come on to calls with angry, sleep-deprived customers and turn them into cooing pussycats within minutes.
I am forever grateful to these people for their calmness under pressure and often in the face of extreme ire – particularly the folks in Australia (like Neil and Callum) who, for timezone-related reasons, are the first shift online after a weekend and therefore inherit more than their fair share of fireballs on a Monday morning.
So to summarise: be nice to them. One day you might need them.