Surface pod Events (FailedScheduling, etc.) on waitForPodPhases timeout

## Problem

When a workflow pod fails to reach `Running` for a reason that lives in K8s **Events** rather than on the pod object itself, the hook surfaces only:

```
##[error]Error: pod failed to come online with error: <generic timeout>
##[error]Executing the custom container implementation failed. Please contact your self hosted runner administrator.
```

The most common cases:

| Event | What the user actually needs to see |
|---|---|
| `FailedScheduling` | `0/14 nodes are available: 13 Too many pods, 1 node was unschedulable.` |
| `FailedScheduling` | `Insufficient cpu` / `Insufficient memory` |
| `FailedScheduling` | `node(s) had untolerated taint …` |
| `Failed` (kubelet) | the image pull errors that precede `ImagePullBackOff` (e.g. `Error: ErrImagePull`) |

The ephemeral workflow pod is typically pruned before an operator can `kubectl describe` it, so the diagnostic is lost.

## Scope vs other PRs

- [#336](https://github.com/actions/runner-container-hooks/pull/336) handles `containerStatuses[].state.waiting.{reason,message}` on the pod object — covers `ImagePullBackOff`, missing tags, etc. **Doesn't read Events.**
- [#341](https://github.com/actions/runner-container-hooks/pull/341) (merged) fixes the `{}` empty-message problem in 4 throw sites. **Doesn't read Events.**
- [#364](https://github.com/actions/runner-container-hooks/pull/364) (open) fixes the circular-JSON crash. **Doesn't read Events.**

This issue covers the remaining gap: reading pod **Events** when the pod object alone doesn't explain the failure.

## Proposed fix (~15 LOC)

In `k8s/index.ts` `waitForPodPhases`, in the catch / timeout path, before throwing, fetch the most recent `Warning` events for the pod and append them to the error message — best-effort, swallow any API failure:

```ts
// Best-effort diagnostics: append the newest Warning events for the pod.
let extra = ''
try {
  const events = await k8sApi.listNamespacedEvent({
    namespace: namespace(),
    fieldSelector: `involvedObject.name=${podName},type=Warning`
  })
  // Newest first; an event with no timestamp falls back to epoch 0 and sorts last.
  const warnings = (events.items ?? [])
    .sort((a, b) => +new Date(b.lastTimestamp ?? b.eventTime ?? 0) -
                    +new Date(a.lastTimestamp ?? a.eventTime ?? 0))
    .slice(0, 3)
    .map(e => `[${e.reason}] ${e.message}`)
  if (warnings.length) extra = `; events: ${warnings.join('; ')}`
} catch { /* never let the diagnostic lookup mask the original failure */ }
throw new Error(
  `Pod ${podName} is unhealthy with phase status ${phase}: ${formatError(error)}${extra}`
)
```

Unit test: mock `listNamespacedEvent` to return a `FailedScheduling: 0/3 nodes are available: 3 Too many pods.` warning; assert that substring shows up in the thrown error.
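
One way to keep that unit test small would be to pull the formatting step into a pure function, so the assertion needs no API mock at all. The sketch below is only an illustration: `formatWarningEvents` and the `PodEventLike` shape are hypothetical names that do not exist in the repo, and the real client event type carries many more fields.

```ts
// Hypothetical minimal event shape (assumption); only the fields used here.
interface PodEventLike {
  type?: string
  reason?: string
  message?: string
  lastTimestamp?: Date | string
  eventTime?: Date | string
}

// Hypothetical helper: returns '' when there are no Warning events,
// otherwise '; events: [Reason] message; ...' with the newest events first.
function formatWarningEvents(events: PodEventLike[], limit = 3): string {
  const time = (e: PodEventLike): number => {
    const raw = e.lastTimestamp ?? e.eventTime
    if (raw === undefined) return 0 // no timestamp: sort last
    const t = raw instanceof Date ? raw.getTime() : Date.parse(raw)
    return Number.isNaN(t) ? 0 : t
  }
  const warnings = events
    .filter(e => e.type === 'Warning') // filter() copies, so the input is not mutated
    .sort((a, b) => time(b) - time(a))
    .slice(0, limit)
    .map(e => `[${e.reason ?? 'Unknown'}] ${e.message ?? ''}`.trim())
  return warnings.length ? `; events: ${warnings.join('; ')}` : ''
}

// Usage: the Normal event is dropped, the Warning is kept.
const suffix = formatWarningEvents([
  { type: 'Normal', reason: 'Scheduled', message: 'assigned', lastTimestamp: '2024-01-01T00:00:02Z' },
  {
    type: 'Warning',
    reason: 'FailedScheduling',
    message: '0/3 nodes are available: 3 Too many pods.',
    lastTimestamp: '2024-01-01T00:00:01Z'
  }
])
console.log(suffix)
// prints "; events: [FailedScheduling] 0/3 nodes are available: 3 Too many pods."
```

With a helper like this, the catch path would only need to call the list API and pass `events.items ?? []` through; the `type=Warning` field selector would still do the filtering server-side, and the in-code filter is just a guard.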

## Happy to open the PR

If the proposal looks right, I can open a PR mirroring the structure of #336 + #364 — small, tightly scoped, with a unit test.
