What is the PIP reliability test?

Even when you can technically do a PIP activity, a descriptor for 'needs help' or 'cannot do' applies if you can't do it reliably. The four criteria are: safely (without significant risk of harm), repeatedly (as often as required), in a reasonable time (no more than twice as long as someone without your condition would take), and to an acceptable standard.

Where is the reliability test in the PIP regulations?

Regulation 4(2A) of The Social Security (Personal Independence Payment) Regulations 2013. It applies to every activity and every descriptor; the assessor is required to apply it but often doesn't.

What is 'reasonable time' in the reliability test?

No more than twice as long as someone without your condition would take. If a 20-minute task takes you 60 minutes, that's over the 40-minute limit, so it's not a reasonable time, and a descriptor for 'needs help' applies even where the activity is technically possible.

Does post-exertional malaise (PEM) count for the reliability test?

Yes - PEM means you cannot do the activity repeatedly. Even if you complete it once, the next-day inability to repeat it means you fail the 'repeatedly' criterion. This is one of the strongest arguments for ME/CFS, long covid, and fibromyalgia claims.

The PIP reliability test: safely, repeatedly, in a reasonable time, to an acceptable standard

The rule that doesn’t appear on the form, but decides whether your answers actually score.

Key Takeaways

The reliability test is set in regulation 4(2A) of the PIP Regulations 2013 and confirmed by MA v SSWP [2016] UKUT 0136 (AAC).

To count as being able to do an activity, you must be able to do it safely, repeatedly, in a reasonable time, and to an acceptable standard.

Reasonable time is defined: no more than twice as long as someone without the condition would take.

Failing any one of the four criteria is enough. The descriptor for “can do this unaided” no longer applies; a higher descriptor does.

Post-exertional malaise (PEM) is the canonical “fails on repeatedly” case - central to ME/CFS, long covid, and fibromyalgia claims.

There’s a rule in the PIP scoring system that doesn’t appear anywhere on the form, isn’t explained in the guidance booklet that comes with it, and is rarely raised by assessors during a phone call or face-to-face appointment. But it’s written into the regulations, has been confirmed by the courts, and decides whether the descriptors you think apply to you actually do.

It’s called the reliability test. It says that for a descriptor to count as something you can do, you must be able to do it:

Safely
Repeatedly
In a reasonable time
To an acceptable standard

Failing any one of these means you can’t do the activity, in the legal sense, even if you can technically complete it. And that means a higher descriptor (one for “needs help” or “cannot do”) can apply to you.

This is one of the most under-used parts of the PIP system. Many claimants describe themselves as able to do things they technically can complete, without ever stopping to think about whether they can do them safely, repeatedly, in reasonable time, and to an acceptable standard. When you understand the test, your honest description of yourself often shifts: not by exaggeration, but by precision.

Where does the reliability test come from?

The reliability test is set out in regulation 4(2A) of the Social Security (Personal Independence Payment) Regulations 2013 (retrieved May 2026). The regulation says, in plain terms, that a person is only assessed as satisfying a descriptor if they can do so reliably, and then defines the four criteria above.

Each term has a specific legal definition:

Safely: in a manner unlikely to cause harm to the claimant or another person, either during or after completing the activity.
Repeatedly: as often as the activity being assessed is reasonably required to be completed.
Reasonable time period: no more than twice as long as the maximum period that a person without the relevant condition would normally take.
Acceptable standard: this isn’t separately defined in the regulations, but is generally interpreted as a standard most people would consider adequate.

The test was strengthened by case law, particularly MA v Secretary of State for Work and Pensions [2016] UKUT 0136 (AAC), which confirmed how it should be applied. It is now treated by the Department for Work and Pensions and tribunals as integral to scoring every descriptor, not just an optional consideration.

The practical effect is this: whenever you describe being able to do something on your PIP form, you’re implicitly claiming you can do it reliably. If you can’t, you should say so.

Safely: can you do it without significant risk of harm?

The first criterion. To count as being able to do an activity, you must be able to do it without significant risk of harm, either during the activity or as a result of it.

This isn’t about theoretical risk: everyone takes some risk doing anything. It’s about whether your specific condition makes the risk meaningfully greater than it would be for someone without your condition.

A claimant with vertigo who can technically shower but has fallen twice while doing so is not washing safely. A claimant with epilepsy who could in theory cook on a hob but might lose consciousness mid-cook is not preparing food safely. A claimant with severe ADHD who can drive but has had three near-collisions because of inattention is not undertaking journeys safely. A claimant with osteoporosis who can walk to the bathroom but whose risk of fracture from falling has been described as significant is not moving around safely without aids or supervision.

The test is whether harm is likely, not certain. The regulations use the phrase “unlikely to cause harm,” so the question is whether, on the balance of probabilities, the activity puts you or others at meaningful risk.

If safety concerns mean you avoid the activity, only do it with someone present, or only do it after taking precautions someone without your condition wouldn’t need, that’s the test telling you a higher descriptor applies.

Repeatedly: can you do it as often as required?

This is the criterion that protects claimants with fatigue, post-exertional malaise, pain that escalates with activity, and any condition where doing something once means you can’t do it again.

The regulations define “repeatedly” to mean as often as the activity being assessed is reasonably required to be completed. In practice, the question is: how often does this activity usually need to happen?

Cooking a simple meal: most people do this once a day, often twice. If you can cook a meal but doing so means you can’t cook again that day, you can’t do it repeatedly.
Washing: most people wash daily. If you can shower once but the effort of doing so puts you in bed for the rest of the day, you can’t wash repeatedly.
Walking 50 metres: most people walk this distance multiple times a day, often dozens of times. If you can walk 50m once but doing so means you cannot walk again for hours, you can’t do it repeatedly.
Dressing: most people dress once a day, sometimes twice. If you can dress but doing so leaves you exhausted to the point you can’t undress again that night without help, you can’t dress and undress repeatedly.

The pattern that matters most here is post-exertional malaise: common in ME/CFS, long covid, fibromyalgia, severe MS, and several other conditions. PEM is the medically recognised pattern where physical or cognitive exertion produces a delayed crash that can last hours, days, or longer. Someone with PEM might be capable of doing an activity once, but not capable of doing it as often as life requires.

If this describes you, name it. Use the term “post-exertional malaise” or “I crash after activity” explicitly. Describe what the crash looks like and how long it lasts. Don’t let the assessor assume that “she can walk 50m” means “she can walk 50m repeatedly throughout her day.”

In a reasonable time: no more than twice as long?

The regulations are specific here: a reasonable time means no more than twice as long as the maximum a person without the condition would typically take.

This is a specific, usable benchmark. It means you can apply numbers to your form answers.

If preparing a simple meal would take a person without your condition 30 minutes, taking longer than 60 minutes means you can’t do it in a reasonable time.
If washing and dressing typically takes a person 20 minutes, taking longer than 40 minutes means you can’t do it in a reasonable time.
If a short familiar journey would take 10 minutes for someone without your condition, taking longer than 20 minutes means you can’t follow it in a reasonable time.
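
The doubling threshold in these examples is simple arithmetic. As a minimal sketch in Python (the function name and the timings are illustrative assumptions, not anything taken from the regulations or from DWP guidance), the check might look like this:

```python
def within_reasonable_time(your_minutes: float, typical_max_minutes: float) -> bool:
    """Return True if the time taken is no more than twice the typical maximum.

    typical_max_minutes is your own estimate of the longest a person without
    the condition would normally take. It is an assumption, not an official figure.
    """
    return your_minutes <= 2 * typical_max_minutes


# Illustrative timings only, based on the meal example above (typical: 30 minutes).
print(within_reasonable_time(60, 30))  # True: exactly double is still within the limit
print(within_reasonable_time(75, 30))  # False: more than double, so this limb is not met
```

The comparison is "no more than twice", so taking exactly double still counts as a reasonable time; only going over the doubled figure fails this limb.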

These numbers will vary by activity and by person, but the doubling test is the formal threshold. If you take significantly longer than someone else would because of pain, fatigue, executive dysfunction, dexterity issues, or anything else, you can fail the reliability test on that basis alone, even if you can complete the activity to standard, even if it’s safe, even if you can repeat it.

Many claimants don’t realise this is countable. They think “yes I can dress myself” without noticing it takes them an hour. The form’s question is implicitly “can you do this reliably?” - and an hour to dress, when most people take fifteen minutes, is a failure of the reasonable-time limb.

If your answer to a question is “yes, but it takes me much longer,” you should write that out, with rough timings if you can. For example: “I can dress myself, but it usually takes me 45–60 minutes because of joint stiffness in my hands and the rest periods I need.”

To an acceptable standard: would most people call this adequate?

The fourth criterion is the most subjective, but still has practical application. It asks whether the activity is being completed to a standard most people would consider adequate.

A claimant with severe depression who washes by splashing water on their face every few days is technically washing, but not to an acceptable standard. A claimant with autism whose teeth-brushing takes 30 seconds because of sensory aversion is technically managing oral care, but not to an acceptable standard. A claimant with fibro fog who can prepare food but undercooks it, leaves the hob on, or eats food raw because cooking is too cognitively demanding is technically preparing food, but not to an acceptable standard.

The “acceptable standard” criterion is most relevant for:

Hygiene: depression, executive dysfunction, sensory issues
Nutrition: eating but not eating adequately, skipping meals, surviving on snacks because cooking is too hard
Dressing: wearing the same clothes for days because changing is too difficult
Communication: speaking but not making yourself understood, or understanding but missing key information
Moving around: walking but with such poor gait or so painfully that the activity is not being done to a standard most people would recognise

If you’re meeting the bare minimum but not what most people would consider adequate, you should describe it that way on the form.

How do the four reliability criteria interact?

The four criteria are independent. Failing any one of them is enough to fail the test as a whole.

You don’t need to fail all four. You don’t need to fail two. Failing one means the descriptor for “can do this activity unaided” doesn’t apply to you, and a higher descriptor - one that involves needing help, supervision, prompting, or being unable to do the activity at all - likely does.

Worked example: rheumatoid arthritis claimant, preparing food

A claimant with severe rheumatoid arthritis describes preparing food. She can:

Stand at the hob long enough to cook ✓ Safely
Cook one meal, but is too exhausted afterwards to cook a second the same day ✗ Repeatedly
Cook a simple meal in about 40 minutes (twice the typical 20) ~ Borderline reasonable time
Produce edible, properly cooked food ✓ Acceptable standard

She fails on “repeatedly” alone. The descriptor for being able to prepare food unaided does not apply to her, even though she can complete the act once. A higher descriptor - likely “needs supervision or assistance” or “cannot prepare and cook food” depending on her overall pattern - is the right fit.

The form question doesn’t ask “can you cook a meal once today?” It asks how preparing food is for you, generally, reliably, across your week. The reliability test is what makes the difference.
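
The worked example above can also be written out as a small checklist. This is a hedged sketch in Python: the class and method names are hypothetical, and the borderline timing is treated as met purely for illustration. It shows only the "fail any one limb and the test fails" logic, not how a decision maker actually scores a claim.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityCheck:
    """Hypothetical record of the four limbs for a single activity."""
    safely: bool
    repeatedly: bool
    reasonable_time: bool
    acceptable_standard: bool

    def failed_limbs(self) -> list[str]:
        """List the limbs that are not met."""
        limbs = {
            "safely": self.safely,
            "repeatedly": self.repeatedly,
            "reasonable_time": self.reasonable_time,
            "acceptable_standard": self.acceptable_standard,
        }
        return [name for name, met in limbs.items() if not met]

    def is_reliable(self) -> bool:
        """The activity is done reliably only if every limb is met."""
        return not self.failed_limbs()


# The preparing-food example: only "repeatedly" is not met.
cooking = ReliabilityCheck(
    safely=True,
    repeatedly=False,
    reasonable_time=True,  # borderline in the example; assumed met here
    acceptable_standard=True,
)
print(cooking.is_reliable())   # False
print(cooking.failed_limbs())  # ['repeatedly']
```

Listing the failed limbs, rather than returning a bare yes or no, mirrors the advice in this guide: say which limb you fail on and why.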

How do you apply the test to your form answers?

You don’t need to write out all four criteria for every answer. You do need to apply them silently before deciding what to write.

A workable approach:

Read each activity. Picture yourself doing the task as it usually needs to happen.
Walk through the four limbs.
- Can I do this safely?
- Can I do it as often as it needs to happen, without crashing or becoming unable to repeat it?
- Can I do it in no more than twice the time someone without my condition would take?
- Can I do it to a standard most people would consider adequate?
Wherever you fail one of these, write that into your answer. Use specific language: “I can prepare a simple meal once, but doing so leaves me unable to cook again that day.” Or: “I can dress myself but it takes me about an hour because of joint pain in my hands.”
Don’t bury it. If reliability is the reason a higher descriptor applies to you, name it explicitly. The assessor reading your form is looking for descriptor-relevant language. Vague phrasing gets read past.

The phrases that work include:

“I can do this once but cannot repeat it because of post-exertional fatigue.”
“It takes me significantly longer than someone without my condition - usually around X minutes.”
“I can technically do this but not to a standard I’d consider adequate - I often skip steps.”
“It is not safe for me to do this without supervision because of [specific risk].”

Each of these is a one-sentence reliability-test statement. Sprinkle them through your answers wherever they apply.

What if your assessment ignores the reliability test?

If you go to assessment and the assessor records that you “can do” an activity without exploring whether you can do it reliably, you can challenge this at Mandatory Reconsideration and at tribunal.

Regulation 4(2A) makes the reliability test mandatory in PIP assessments, and MA v Secretary of State for Work and Pensions confirmed how it should be applied. An assessment that doesn’t apply it is procedurally flawed, and identifying this flaw - descriptor by descriptor - is one of the most effective grounds for a successful Mandatory Reconsideration or appeal.

If you’re at the post-decision stage, see our guide to writing a Mandatory Reconsideration request and our overview of PIP tribunal appeals.

Where does the reliability test fit with other PIP rules?

The reliability test interlocks with two other rules:

The descriptor system itself - see our foundational guide to PIP descriptors.
The 50% rule - even when reliability fails, that failure has to apply on most days for the worse descriptor to score. See our 50% of the time rule explained.

Reliability tells you whether a descriptor applies at all. The 50% rule tells you whether it applies often enough to count. Together, they’re the two filters every PIP answer passes through, usually invisibly to the claimant.
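
To see how the two filters combine, here is a minimal sketch in Python. The function name is hypothetical, and a single week of made-up data is used purely for illustration. It assumes you have already decided, day by day, whether you could do the activity reliably.

```python
def higher_descriptor_applies(reliable_each_day: list[bool]) -> bool:
    """Return True if the activity could not be done reliably on more than half the days.

    Each entry is one day: True means all four limbs were met that day,
    False means at least one limb was not met.
    """
    unreliable_days = sum(1 for reliable in reliable_each_day if not reliable)
    return unreliable_days > len(reliable_each_day) / 2


# Made-up week: the activity was unreliable on 4 days out of 7.
week = [False, False, True, False, True, False, True]
print(higher_descriptor_applies(week))  # True
```

The first filter (reliability) produces the True or False for each day; the second (the 50% rule) counts how many days fall on the wrong side.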

Naming both, gently and accurately, in your form answers is what closes the gap between honest description and descriptor-aligned language.

Free help is available

The reliability test is technical, and the language that captures it well takes practice. Free help with PIP claims is available from Citizens Advice, Scope, Disability Rights UK, your local welfare rights service, and several condition-specific charities. We strongly recommend using these services, particularly for a first claim or for an appeal.

If you’d like to draft your PIP answers with the reliability test built in, our tool walks you through each question and offers optional AI rewriting that translates your honest answers into descriptor-aligned language without changing what you said. You can start here.

This page describes the PIP reliability test as set out in regulation 4(2A) of the Social Security (Personal Independence Payment) Regulations 2013 (retrieved May 2026), as interpreted in MA v Secretary of State for Work and Pensions [2016] UKUT 0136 (AAC) and subsequent case law. This page is general information, not legal or benefits advice; your specific circumstances will determine how the test applies to you.