<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Oblique</title><description>Product news and updates from the Oblique team</description><link>https://oblique.security/</link><item><title>Generating an MCP server in Go</title><link>https://oblique.security/blog/mcp/</link><guid isPermaLink="true">https://oblique.security/blog/mcp/</guid><description>MCP is table stakes for any service, and Go&apos;s official SDK makes support trivial. The hard part is staying current, so we generate ours from Protobuf.</description><pubDate>Thu, 14 May 2026 17:45:00 GMT</pubDate><content:encoded>&lt;p&gt;MCP is table stakes for products these days. Everything’s AI, and agents will query your data one way or another.&lt;/p&gt;&lt;p&gt;As we’ve built Oblique, a core principle is not having internal APIs. Anything a user is able to do in the UI should be possible through Terraform or REST - and now, MCP. We generate tons of bindings from our &lt;a href=&quot;https://oblique.security/blog/type-safe-frontend-apis/&quot;&gt;gRPC interface&lt;/a&gt; with the explicit intent that as new features are added, they make their way to all of our integration points.&lt;/p&gt;&lt;h2&gt;Wait, isn’t MCP dead?&lt;/h2&gt;&lt;p&gt;There’s an infinite number of ways to expose capabilities to agents. Command line tools, &lt;a href=&quot;https://posthog.com/newsletter/agent-first-product-engineering#2-meet-agents-at-their-level-of-abstraction&quot;&gt;SQL interfaces&lt;/a&gt;, &lt;a href=&quot;https://blog.cloudflare.com/code-mode/&quot;&gt;transpiling to TypeScript&lt;/a&gt;. If there exists some means of querying or modifying data, someone has experimented with it as an MCP alternative.&lt;/p&gt;&lt;p&gt;Regardless of the tech, the model needs to understand the capabilities exposed to it and how to exercise them. For a command line tool, this means keeping your “--help” output accurate. 
For MCP, this means tool descriptions and input schemas. Fundamentally, if you’re providing primitives to an agent, you’re in the business of keeping the documentation for those interfaces up to date and debugging when things go wrong.&lt;/p&gt;&lt;p&gt;Where MCP is irreplaceable is the growing number of autonomous agent products, the Claude Code Routines and Codex Automations of the world. Sure, you may choose to implement raw tool calls through a custom harness for your internal workflows, or shell out to a binary on your laptop. But if you want to expose data to a customer workflow, you have to speak MCP. It’s become the lowest common denominator for working with agents.&lt;/p&gt;&lt;h2&gt;Tool descriptions and schemas&lt;/h2&gt;&lt;p&gt;Descriptions and schemas are critical for tool discovery and for informing the model how to construct tool calls. A majority of the issues we hit boiled down to incorrect plumbing of comments or typos in our API docs. For example, we had a bug where enum fields were commented but their values weren’t.&lt;/p&gt;&lt;p&gt;As we covered in our &lt;a href=&quot;https://oblique.security/blog/type-safe-frontend-apis/&quot;&gt;previous post&lt;/a&gt;, we generate as much of our server tooling as possible from Protobuf. In this case, we leveraged Go’s &lt;a href=&quot;https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect&quot;&gt;protoreflect&lt;/a&gt; package to produce tool definitions for the &lt;a href=&quot;https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp&quot;&gt;Go MCP SDK&lt;/a&gt;. This means we consume a proto file:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;service Oblique {
  // Fetch a user by name, or using the alias &quot;users/me&quot; to return the currently
  // authenticated user.
  rpc GetUser(GetUserRequest) returns (User) {
    option (google.api.http) = {get: &quot;/api/v1/{name=users/*}&quot;};
  }
}
message GetUserRequest {
  // Name of the user of the format &quot;users/{id}&quot;. The alias &quot;users/me&quot; is also
  // supported and resolves to the currently authenticated user.
  string name = 1 [
    (google.api.field_behavior) = REQUIRED
  ];
}
// A user represents a human user in the directory.
message User {
  // Format: `users/{id}`
  string name = 1 [(google.api.field_behavior) = IDENTIFIER];
  // Primary display name of the user.
  string display_name = 2 [(google.api.field_behavior) = REQUIRED];
  // Primary email of the user.
  string email = 3 [(google.api.field_behavior) = REQUIRED];
  // ...
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;…and spit out a tool description:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
	&quot;annotations&quot;: {
		&quot;readOnlyHint&quot;: true,
		&quot;title&quot;: &quot;GetUser&quot;
	},
	&quot;name&quot;: &quot;GetUser&quot;,
	&quot;description&quot;: &quot;Fetch a user by name, or using the alias \&quot;users/me\&quot; to return the currently authenticated user.&quot;,
	&quot;inputSchema&quot;: {
		&quot;type&quot;: &quot;object&quot;,
		&quot;properties&quot;: {
			&quot;name&quot;: {
				&quot;type&quot;: &quot;string&quot;,
				&quot;description&quot;: &quot;Name of the user of the format \&quot;users/{id}\&quot;. The alias \&quot;users/me\&quot; is also supported and resolves to the currently authenticated user.&quot;
			}
		},
		&quot;required&quot;: [
			&quot;name&quot;
		]
	},
	&quot;outputSchema&quot;: {
		&quot;type&quot;: &quot;object&quot;,
		&quot;description&quot;: &quot;A user represents a human user in the directory.&quot;,
		&quot;properties&quot;: {
			&quot;name&quot;: {
				&quot;type&quot;: &quot;string&quot;,
				&quot;description&quot;: &quot;Format: `users/{id}`&quot;
			},
			&quot;displayName&quot;: {
				&quot;type&quot;: &quot;string&quot;,
				&quot;description&quot;: &quot;Primary display name of the user.&quot;
			},
			&quot;email&quot;: {
				&quot;type&quot;: &quot;string&quot;,
				&quot;description&quot;: &quot;Primary email of the user.&quot;
			}
		},
		&quot;required&quot;: [
			&quot;name&quot;,
			&quot;displayName&quot;,
			&quot;email&quot;
		]
	}
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We’ve consistently found that handwritten comments from humans are tremendously powerful at directing models and outperform anything generated, whether in an AGENTS.md file or in tool descriptions. This makes some intuitive sense, since if an agent can generate a comment, it can likely derive that information anyway.&lt;/p&gt;&lt;p&gt;The challenge here is accessing source code comments, which &lt;a href=&quot;https://github.com/golang/protobuf/issues/1134&quot;&gt;aren’t available&lt;/a&gt; in files produced by the Go proto plugin. To work around this, our generation scripts output a &lt;a href=&quot;https://buf.build/docs/reference/descriptors/&quot;&gt;descriptor file&lt;/a&gt; and embed it in our Go server. That file is then parsed with the &lt;a href=&quot;https://pkg.go.dev/google.golang.org/protobuf/reflect/protodesc&quot;&gt;protodesc&lt;/a&gt; package to access comments as we walk the message descriptors:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;//go:embed descriptor_set.binpb
var descriptor []byte

var descriptorFiles *protoregistry.Files

func init() {
	opts := protodesc.FileOptions{AllowUnresolvable: true}
	set := &amp;descriptorpb.FileDescriptorSet{}
	if err := proto.Unmarshal(descriptor, set); err != nil {
		panic(&quot;parsing descriptor: &quot; + err.Error())
	}
	files, err := opts.NewFiles(set)
	if err != nil {
		panic(&quot;resolving files: &quot; + err.Error())
	}
	descriptorFiles = files
}

type direction int

const (
	input  direction = 0
	output direction = 1
)
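
// Hypothetical sketch, not the generator used in this post: one way
// schemaForMessage could decide whether a field belongs in a schema.
// The helper name includeField is an assumption for illustration. It also
// assumes two extra imports: google.golang.org/protobuf/proto and
// google.golang.org/genproto/googleapis/api/annotations.
func includeField(fd protoreflect.FieldDescriptor, dir direction) bool {
	// Output schemas keep every field; only tool inputs are filtered here.
	if dir != input {
		return true
	}
	// google.api.field_behavior is a repeated enum extension on FieldOptions.
	// The comma-ok assertion leaves behaviors nil if the option is absent
	// or was not resolved to the generated enum type.
	behaviors, _ := proto.GetExtension(fd.Options(), annotations.E_FieldBehavior).([]annotations.FieldBehavior)
	for _, b := range behaviors {
		if b == annotations.FieldBehavior_OUTPUT_ONLY {
			// The server sets this field, so a tool call must not supply it.
			return false
		}
	}
	return true
}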

func newTool(md protoreflect.MethodDescriptor) (*mcp.Tool, error) {
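	// Look up the method in the embedded descriptor set, which retains source comments.
	fullName := md.FullName()
	desc, err := descriptorFiles.FindDescriptorByName(fullName)
	if err != nil {
		return nil, fmt.Errorf(&quot;finding descriptor by name: %s: %v&quot;, fullName, err)
	}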
	description := desc.ParentFile().SourceLocations().ByDescriptor(desc).LeadingComments

	isDestructive := strings.HasPrefix(string(md.Name()), &quot;Delete&quot;)
	readonlyPrefixes := []string{&quot;Get&quot;, &quot;List&quot;, &quot;BatchGet&quot;}
	isReadOnly := false
	for _, p := range readonlyPrefixes {
		if isReadOnly = strings.HasPrefix(string(md.Name()), p); isReadOnly {
			break
		}
	}

	inputSchema, err := schemaForMessage(md.Input(), input)
	if err != nil {
		return nil, fmt.Errorf(&quot;generating schema for input: %v&quot;, err)
	}
	outputSchema, err := schemaForMessage(md.Output(), output)
	if err != nil {
		return nil, fmt.Errorf(&quot;generating schema for output: %v&quot;, err)
	}

	// The display title mirrors the RPC name, as in the generated output shown earlier.
	toolName := string(md.Name())
	return &amp;mcp.Tool{
		Name:         string(md.Name()),
		Description:  description,
		InputSchema:  inputSchema,
		OutputSchema: outputSchema,
		Annotations: &amp;mcp.ToolAnnotations{
			Title:           toolName,
			DestructiveHint: &amp;isDestructive,
			ReadOnlyHint:    isReadOnly,
		},
	}, nil
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Our schema generation also filters fields based on the context (the code was a bit big for this post). If a message is used for tool input, fields with an OUTPUT_ONLY annotation are ignored by the generator. We also remove metadata fields from our API objects that aren’t critical for the model. Consider the following result from our REST API (&lt;a href=&quot;https://platform.openai.com/tokenizer&quot;&gt;185 tokens&lt;/a&gt;):&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
    &quot;name&quot;: &quot;users/ekqzuevxt544jl9i&quot;,
    &quot;createTime&quot;: &quot;2026-05-12T20:51:28.504018Z&quot;,
    &quot;updateTime&quot;: &quot;2026-05-12T20:51:28.512136Z&quot;,
    &quot;deleteTime&quot;: null,
    &quot;displayName&quot;: &quot;Eric Chiang&quot;,
    &quot;email&quot;: &quot;eric@rhombic.dev&quot;,
    &quot;secondaryEmails&quot;: [
        &quot;eric@obliquesecurity.com&quot;,
        &quot;eric@oblique.security&quot;
    ],
    &quot;title&quot;: &quot;Software Engineer&quot;,
    &quot;manager&quot;: &quot;users/sx1qezek79nqyrb0&quot;,
    &quot;pictureUri&quot;: &quot;https://avatars.githubusercontent.com/u/2342749&quot;,
    &quot;directReportCount&quot;: 2,
    &quot;totalReportCount&quot;: 2
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;On output we use &lt;a href=&quot;https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect#Message&quot;&gt;protoreflect&lt;/a&gt; to clear common fields (“createTime”, “updateTime”), as well as message-specific fields that aren’t as relevant to an MCP client. In this case our user object consumes half as many tokens (97) as its API equivalent:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
    &quot;name&quot;: &quot;users/ekqzuevxt544jl9i&quot;,
    &quot;displayName&quot;: &quot;Eric Chiang&quot;,
    &quot;email&quot;: &quot;eric@rhombic.dev&quot;,
    &quot;secondaryEmails&quot;: [
        &quot;eric@obliquesecurity.com&quot;,
        &quot;eric@oblique.security&quot;
    ],
    &quot;title&quot;: &quot;Software Engineer&quot;,
    &quot;manager&quot;: &quot;users/sx1qezek79nqyrb0&quot;
}&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Debugging performance&lt;/h2&gt;&lt;p&gt;The most common bug report from our initial test was &lt;em&gt;“the model is confused by X.”&lt;/em&gt;&lt;/p&gt;&lt;p&gt;If you’re at a big company, you have an entire team dedicated to evaluations and prompt engineering. For this particular feature, it was just me debugging where the agent was getting tripped up, and trying to ensure new changes didn’t degrade previous performance.&lt;/p&gt;&lt;p&gt;One of the early features we implemented was a developer mode running our MCP server over &lt;a href=&quot;https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp#StdioTransport&quot;&gt;stdio&lt;/a&gt;. This paired with simulated data allowed us to run prompts in a sandbox. Our first iteration was to call Claude Code’s &lt;a href=&quot;https://code.claude.com/docs/en/headless&quot;&gt;programmatic mode&lt;/a&gt; with an MCP configuration and stream the results:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;SERVER_BIN=&quot;$PWD/bin/oblique-server&quot;
go build -o &quot;$SERVER_BIN&quot; ./cmd/oblique-server
SYSTEM_PROMPT=&quot;$PWD/mcp/eval-systemprompt.txt&quot;
TMPDIR=&quot;$( mktemp -d )&quot;
MCP_CONFIG=&quot;$TMPDIR/mcp_config.json&quot;
cat &gt;&quot;$MCP_CONFIG&quot; &lt;&lt;EOF
{
  &quot;mcpServers&quot;: {
    &quot;oblique&quot;: {
      &quot;command&quot;: &quot;$SERVER_BIN&quot;,
      &quot;args&quot;: [&quot;--seed-fixture=oblique&quot;,&quot;--insecure-mcp-stdio&quot;]
    }
  }
}
EOF
cd &quot;$TMPDIR&quot;
claude -p &quot;$PROMPT&quot; \
    --mcp-config &quot;$MCP_CONFIG&quot; \
    --append-system-prompt-file &quot;$SYSTEM_PROMPT&quot; \
    --strict-mcp-config \
    --mcp-debug \
    --permission-mode dontAsk \
    --allowedTools mcp__oblique \
    --verbose \
    --output-format stream-json&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Claude Code’s &lt;a href=&quot;https://code.claude.com/docs/en/headless#stream-responses&quot;&gt;streaming JSON format&lt;/a&gt; logs a ton of internals about the harness to help understand what’s going on. We can see the results of tool search, to help determine if a tool’s description needs improvement:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
  &quot;type&quot;: &quot;user&quot;,
  &quot;tool_use_result&quot;: {
    &quot;matches&quot;: [
      &quot;mcp__oblique__GetUser&quot;,
      &quot;mcp__oblique__SearchUsers&quot;,
      &quot;mcp__oblique__SearchTeamMembers&quot;,
      &quot;mcp__oblique__SearchTeams&quot;
    ],
    &quot;query&quot;: &quot;select:mcp__oblique__GetUser,mcp__oblique__SearchUsers,mcp__oblique__SearchTeamMembers,mcp__oblique__SearchTeams&quot;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Or inspect successful and unsuccessful tool calls, which can point to deficiencies in our schema comments:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
  &quot;type&quot;: &quot;assistant&quot;,
  &quot;message&quot;: {
    &quot;type&quot;: &quot;message&quot;,
    &quot;role&quot;: &quot;assistant&quot;,
    &quot;content&quot;: [
      {
        &quot;type&quot;: &quot;tool_use&quot;,
        &quot;name&quot;: &quot;mcp__oblique__SearchUsers&quot;,
        &quot;input&quot;: {
          &quot;filter&quot;: &quot;email=\&quot;eric@oblique.security\&quot;&quot;
        }
      }
    ]
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We took that initial bash script and wrote a ~300-line Go program that consumes a library of prompts, calls Claude Code in parallel for each one, and parses the output using Go’s &lt;a href=&quot;https://pkg.go.dev/encoding/json#Decoder&quot;&gt;streaming JSON&lt;/a&gt; support. For every prompt, the program produces a report including the set of tools that were loaded and the tool calls:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;- prompt: What teams am I on?
  tool_matches:
    - mcp__oblique__GetUserProfile
    - mcp__oblique__SearchTeams
    - mcp__oblique__SearchTeamMembers
    - mcp__oblique__BatchGetTeams
  tool_calls:
    - name: mcp__oblique__SearchTeamMembers
      input: &apos;{&quot;parent&quot;:&quot;teams/-&quot;,&quot;filter&quot;:&quot;member=\&quot;users/me\&quot;&quot;}&apos;
    - name: mcp__oblique__BatchGetTeams
      input: &apos;{&quot;names&quot;:[&quot;teams/engineering&quot;,&quot;teams/spanish-conversation-group&quot;,&quot;teams/puzzle-box&quot;]}&apos;
  tool_errors: []
  result: |-
    You&apos;re on 3 teams:

    - **Engineering**
    - **Spanish conversation group** — ¡Hola! ¿Cómo estás? ¡Ven a charlar con nosotros los viernes por la mañana! All levels welcome
    - **Puzzle Box** — Make better puzzles&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This view was invaluable for seeing whether changes were steering the model in the right direction. Any time we get a bug report, we now add a prompt to our evals that attempts to replicate it, and are very quickly able to see where the model is getting stuck.&lt;/p&gt;&lt;h2&gt;So what did we learn?&lt;/h2&gt;&lt;p&gt;Generating our MCP server this way has produced a great feedback loop for our API documentation. Good descriptions improve an agent’s understanding as much as a human’s. We made dozens of tweaks to our API comments as clients hit issues, which in turn are pulled into generated bindings for developers (and coding agents), and into future hosted API docs.&lt;/p&gt;&lt;p&gt;While not everything in our REST API cleanly fits into MCP, keeping them coupled with conditional generation logic lets us ensure our MCP server gets treated the same as any other surface in the product.&lt;/p&gt;</content:encoded><author>Eric Chiang</author></item><item><title>The performance bug hiding in our billing settings</title><link>https://oblique.security/blog/the-performance-bug-hiding-in-our-billing-settings/</link><guid isPermaLink="true">https://oblique.security/blog/the-performance-bug-hiding-in-our-billing-settings/</guid><description>How we spent two months debugging Cloud Run latency, built out our tracing along the way, and learned the fix was one line of YAML.</description><pubDate>Tue, 05 May 2026 16:00:00 GMT</pubDate><content:encoded>&lt;p&gt;At Oblique, we run our backend on &lt;a href=&quot;https://cloud.google.com/run&quot;&gt;Google Cloud Run&lt;/a&gt;. It&amp;#x27;s a great fit for what we need: a Go binary, a Postgres database, and a handful of background jobs that sync data from upstream identity providers and integrations. 
Cloud Run handles the lifecycle, we ship containers, everyone is happy.&lt;/p&gt;&lt;p&gt;Until we started noticing slow database query performance.&lt;/p&gt;&lt;p&gt;This is the story of how we spent two months, on and off, building out our amazing observability stack: adding traces, switching tracing backends, enabling Query Insights on our database, chasing locks, hypothesizing about VPC networking… and ultimately learning that Cloud Run was throttling our CPU outside of HTTP requests. We didn’t need all of that, and a one-line annotation on our Cloud Run service would have fixed the entire problem from day one:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;metadata:
  annotations:
    run.googleapis.com/cpu-throttling: &quot;false&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The rest of this post is about how we got there, why we didn&amp;#x27;t get there sooner, and why we’re still happy we did all the work.&lt;/p&gt;&lt;h3&gt;The symptoms&lt;/h3&gt;&lt;p&gt;Our backend has two kinds of work. The first is the obvious kind: HTTP requests from the frontend, served by &lt;a href=&quot;https://oblique.security/blog/type-safe-frontend-apis/&quot;&gt;a Connect handler&lt;/a&gt; running each request in a goroutine. The second kind is &lt;a href=&quot;https://oblique.security/blog/go-synctest/&quot;&gt;background work&lt;/a&gt;: goroutines are periodically kicked off by the same binary that serves the frontend, and these goroutines handle anything performance intensive: syncing data from identity providers and other integrations, recomputing group memberships, running garbage collection, and so on.&lt;/p&gt;&lt;p&gt;The HTTP request endpoints seemed to be unaffected, but the background jobs were unreasonably slow, even considering how much work they were doing.&lt;/p&gt;&lt;p&gt;We were seeing single SQL queries that should have hit an index and returned in tens of milliseconds take &lt;em&gt;five seconds&lt;/em&gt;. Background sync jobs that locally took a couple of seconds were stretching to ten minutes or more. Our database connection pool (4 connections per instance) was constantly saturated, with pool acquisition calls themselves taking multiple seconds.&lt;/p&gt;&lt;p&gt;Our first instinct was to suspect the database queries, so that’s where we started.&lt;/p&gt;&lt;h3&gt;Step 1: dive into the data&lt;/h3&gt;&lt;p&gt;The first thing we did was look at the traces we already had. We&amp;#x27;d been exporting traces to Google Cloud Trace from our Go backend for a while. Looking at our existing data, something was clearly wrong. Some background jobs were taking 5-10 minutes in production, but our existing traces weren&amp;#x27;t telling us &lt;em&gt;why&lt;/em&gt;. 
We could see that a span was slow, but we couldn&amp;#x27;t see what the goroutine was actually doing during that time, what error it hit, or what query it was running. Our spans had almost no metadata. We could see the shape of the slowness without seeing the cause.&lt;/p&gt;&lt;p&gt;So, we started enriching our traces. &lt;a href=&quot;https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability&quot;&gt;GCP recently added support&lt;/a&gt; for consuming the OpenTelemetry trace format (OTLP) directly. We rolled out &lt;a href=&quot;https://opentelemetry.io/&quot;&gt;OpenTelemetry&lt;/a&gt; behind a feature flag, both to allow us to run trace collection locally, and to lean into the OTel ecosystem of libraries for observability data. It allowed us to run a &lt;a href=&quot;https://www.jaegertracing.io/&quot;&gt;Jaeger&lt;/a&gt; collector locally and inspect traces in real time while running the application against a local Postgres instance. If we could reproduce the slowness locally with better tooling, we&amp;#x27;d have a much faster debug loop.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/825967b8693480de17d29ab16e49df5432562dec-1999x1155.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;The results were immediate and clarifying: not a single trace locally took longer than one second against a local postgres DB. Whatever was happening, it wasn&amp;#x27;t a property of our code or our queries in isolation. Since our production environment is different from our local development environment, and identical code running in both environments exhibited different characteristics, that’s where the problem had to be.&lt;/p&gt;&lt;h3&gt;Step 2: chase every hypothesis&lt;/h3&gt;&lt;p&gt;Once we turned on OTel in production, we could see actual error messages on failed spans. We could see the &lt;code&gt;pgx&lt;/code&gt; driver&amp;#x27;s &lt;code&gt;pool.Acquire&lt;/code&gt; as its own span. 
We could see individual queries with their durations. We started forming hypotheses. We had a lot of them.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Hypothesis 1: too many SQL calls.&lt;/strong&gt; One of our background jobs was making roughly 1,500 SQL calls per run, mostly because we weren&amp;#x27;t batching anything. We started batching the &lt;code&gt;SELECT&lt;/code&gt;s and &lt;code&gt;INSERT&lt;/code&gt;s. This was the kind of cleanup we&amp;#x27;d been meaning to do for a while, and it did help, but not enough to explain the multi-second outliers. So, that wasn’t the issue.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Hypothesis 2: lock contention in the database.&lt;/strong&gt; We thought maybe write locks from one job were blocking reads in another. We wrote some Postgres queries to look for blocked queries, looked at the Cloud SQL metrics, and turned on &lt;a href=&quot;https://cloud.google.com/sql/docs/postgres/using-query-insights&quot;&gt;Query Insights&lt;/a&gt;. That helped us understand what part of the query was taking a long time, and the trend of queries over time. Something that was very confusing was just how big the deviation between individual identical queries was: taking anywhere from 200ms to 40s to run.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/93c344b3f097aed1d1dc61327dc4b5d5f2f05120-1999x302.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Another curious artifact that Query Insights allowed us to see was that we seemed to spend a lot of time downloading the result rows from the database:&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/85058684d7ec6af148a04d83608350f59f329d4e-930x341.jpg?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;We couldn’t find any evidence of lock contention, so it wasn’t that either.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Hypothesis 3: connection pool saturation.&lt;/strong&gt; With only 4 connections per instance, any slow query starves the pool. 
This was clearly &lt;em&gt;part&lt;/em&gt; of the problem. When you can see &lt;code&gt;pool.Acquire&lt;/code&gt; taking seconds, the pool is your bottleneck. But it was a symptom, not a cause. Something was making individual queries slow, and the pool was just amplifying it. We were on the right track, but the pool itself wasn’t the issue.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Hypothesis 4: network latency between Cloud Run and Cloud SQL.&lt;/strong&gt; The working theory at this point was that network latency was adding just enough overhead to cause individual connections to pile up. This became a leading hypothesis after we turned on Query Insights and saw that the majority of the query time was spent in downloading the result rows.&lt;/p&gt;&lt;p&gt;We then tried switching our Cloud SQL connection to using a private network connection, but it made no noticeable difference. A change we made at the same time however &lt;em&gt;did&lt;/em&gt; make a surprising difference.&lt;/p&gt;&lt;h3&gt;Cloud Run execution environments&lt;/h3&gt;&lt;p&gt;Cloud Run has two &lt;a href=&quot;https://cloud.google.com/run/docs/configuring/execution-environments&quot;&gt;execution environments&lt;/a&gt;. The first-generation environment is gVisor-based, a userspace kernel that intercepts syscalls. The second-generation environment runs your container in a Linux microVM with a real kernel. 
If you don&amp;#x27;t explicitly specify an environment, &lt;a href=&quot;https://cloud.google.com/run/docs/configuring/execution-environments#how_cloud_run_chooses&quot;&gt;Cloud Run picks one for you based on the features your service uses&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;https://cloud.google.com/run/docs/configuring/execution-environments#choosing&quot;&gt;execution environments documentation&lt;/a&gt; recommends using the second generation environment if your service has CPU-intensive workloads or could benefit from faster network performance, so we pinned our containers to that environment, because we were willing to try anything at this point. After we’d made the switch, query performance improved:&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/6bb74cd0a09940e17e1639989d760857d697edd7-1871x416.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/b90086fa8c67bf80e1ba0e62b6c18509774e4364-1999x477.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Pinning to the second generation environment was clearly an improvement. However, shortly after celebrating, we discovered to our horror that the query metrics went back up again. Not as bad as before, but still not where we wanted it to be.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/4ba5088ce1ab6fd64cde27232acab937bd9a4784-1999x995.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;We still didn’t understand what was going on, but without a clear way forward we went back to working on regular application features for a while, with the nagging feeling that something still wasn’t right.&lt;/p&gt;&lt;h3&gt;Two weeks later&lt;/h3&gt;&lt;p&gt;Background jobs were still slower than they had any business being. 
We had made some progress, and still believed that the networking between our containers and the database was somehow cursed, since the change to the second generation environment had improved things slightly. I had just submitted a PR to increase the minimum CPU allocated to our instances to 2 CPUs when we started discussing the possibility of CPU throttling. Reading through the &lt;a href=&quot;https://cloud.google.com/run/docs/configuring/billing-settings#choosing-background-execution&quot;&gt;Cloud Run billing settings docs&lt;/a&gt;, one line jumped out:&lt;/p&gt;&lt;blockquote&gt;The CPU allocated to all containers in an idle instance depends on the configured billing settings.&lt;/blockquote&gt;&lt;p&gt;Wait. &lt;em&gt;Idle&lt;/em&gt; instances? Is an instance with no in-flight HTTP request considered &amp;quot;idle,&amp;quot; even if it&amp;#x27;s running goroutines doing background work?&lt;/p&gt;&lt;p&gt;Cloud Run has two billing modes: request-based and instance-based. Under request-based billing (the default), your container&amp;#x27;s CPU is &lt;em&gt;throttled&lt;/em&gt; when no HTTP request is being processed. Not &amp;quot;scaled down.&amp;quot; Throttled. The degree of throttling depends &lt;em&gt;on the runtime environment&lt;/em&gt;. The reason we’d seen a speedup with the switch to the second generation environment wasn’t because of network performance, it was because &lt;a href=&quot;https://docs.cloud.google.com/run/docs/configuring/services/cpu#cpu-min&quot;&gt;it can’t throttle as aggressively&lt;/a&gt; as the first generation.&lt;/p&gt;&lt;p&gt;Our entire background processing system runs as scheduled goroutines that aren&amp;#x27;t tied to HTTP requests at all. Every single one of those goroutines had been running under heavy CPU throttling the entire time. The fix is an annotation, which translates to setting your instances to instance-based billing:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;metadata:
  annotations:
    run.googleapis.com/cpu-throttling: &quot;false&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now your container gets full CPU all the time, regardless of whether it&amp;#x27;s currently handling an HTTP request. You pay slightly more, because you&amp;#x27;re billed for the wall-clock time your instance exists rather than just the time it&amp;#x27;s serving requests, but for a service like ours, which runs background jobs continuously anyway, that’s the behavior we expect.&lt;/p&gt;&lt;p&gt;We rolled out the change and watched the metrics dashboards. The 99th percentile query latency on our background jobs collapsed:&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/9dd4d582a56492937855c0aae7a21183a9864114-1999x1255.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Background jobs that had been taking tens of minutes now finished in under thirty seconds, the remaining time limited mostly by integration APIs rate limiting. Pool acquisition contention dropped to nothing because individual queries were now finishing as fast as they should have all along. All of the batching work, all of the trace improvements, all of the VPC investigation, none of it was the actual fix. The fix was one annotation that, in retrospect, we should have set on day one.&lt;/p&gt;&lt;h3&gt;Why we don’t regret the work we put in&lt;/h3&gt;&lt;p&gt;The temptation, after a debugging journey like this, is to conclude that we wasted two months on and off on work we didn’t need. We didn&amp;#x27;t.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;OpenTelemetry was worth it.&lt;/strong&gt; The richer span metadata, the ability to run a local trace collector, the integration ecosystem. All of it survived past the fix and is still paying dividends. When the next mystery slow path shows up, we&amp;#x27;ll see it faster.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Query Insights is worth keeping on. 
&lt;/strong&gt;&lt;a href=&quot;https://docs.cloud.google.com/sql/docs/mysql/using-query-insights#pricing&quot;&gt;Enabling it is free&lt;/a&gt;, you only pay for data retention. It didn&amp;#x27;t help us find this particular bug, but it may help us narrow down issues with slow running queries quicker in the future. It’s better to have it before you need it. If you&amp;#x27;re running on Cloud SQL and you don&amp;#x27;t have it on, turn it on.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The batching work was worth doing.&lt;/strong&gt; 1,500 sequential &lt;code&gt;SELECT&lt;/code&gt;s in a single background job was a problem regardless of CPU throttling. We would have had to fix that eventually. We just fixed it earlier than we needed to.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Pinning the execution environment was worth doing.&lt;/strong&gt; The Cloud Run scheduler picking a different environment under your feet makes for an unpredictable metric baseline. Even if the environment generation difference hadn&amp;#x27;t mattered for latency, the fact that the choice is invisible and non-deterministic is enough reason to pin it explicitly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The annotation we should have set on day one.&lt;/strong&gt; If you&amp;#x27;re running on Cloud Run and you have any kind of in-application background work (goroutines you start from a request handler, scheduled jobs, async queue consumers, anything), set &lt;code&gt;run.googleapis.com/cpu-throttling: &amp;quot;false&amp;quot;&lt;/code&gt;. The performance characteristics of CPU-throttled background work are so bad, and so non-obviously bad, that the marginal cost of always-on CPU is basically irrelevant compared to the cost of debugging this for two months.&lt;/p&gt;&lt;h3&gt;Why didn’t we find it sooner?&lt;/h3&gt;&lt;p&gt;We had good observability, better than most. We were actively looking. 
Why did we spend two months on and off on hypotheses that weren&amp;#x27;t the answer before stumbling onto the one that was?&lt;/p&gt;&lt;p&gt;There are three reasons.&lt;/p&gt;&lt;p&gt;The first is that the symptoms looked exactly like a database problem. Multi-second SQL queries, connection pool exhaustion, lock contention candidates. When you see those symptoms, your professional instinct is to investigate the database. And the database was, in some sense, &lt;em&gt;not actually slow&lt;/em&gt;. It was the &lt;em&gt;client&lt;/em&gt; that was being suspended mid-flight by a &lt;code&gt;cgroup&lt;/code&gt;, holding a connection while it slept. From the database&amp;#x27;s perspective, the client just took a really long time to send or receive the next byte. From our traces&amp;#x27; perspective, the query took five seconds. Same data, completely different interpretation.&lt;/p&gt;&lt;p&gt;The second is that &amp;quot;Cloud Run throttles your CPU when no request is in flight&amp;quot; is a non-obvious property of the platform. It is documented. It just isn&amp;#x27;t documented in the places you&amp;#x27;d look while you&amp;#x27;re debugging slow code. The &lt;a href=&quot;https://cloud.google.com/run/docs/configuring/billing-settings&quot;&gt;billing settings page&lt;/a&gt; describes the two billing modes accurately, but it&amp;#x27;s framed entirely as a cost question, not a performance one. The &lt;a href=&quot;https://cloud.google.com/run/docs/tips/general#background-activity&quot;&gt;general development tips&lt;/a&gt; page mentions in a single sentence that you should use instance-based billing for background activity, but it&amp;#x27;s one bullet among dozens of unrelated tips. The &lt;a href=&quot;https://cloud.google.com/blog/products/serverless/cloud-run-gets-always-on-cpu-allocation&quot;&gt;2021 launch blog post&lt;/a&gt; for the always-on CPU feature is the clearest explanation of the model, and it&amp;#x27;s a four-year-old blog post. 
None of these are the pages you read when your database query performance is bad.&lt;/p&gt;&lt;p&gt;We aren&amp;#x27;t the only people to have hit this. After we figured it out, I traded notes with another Cloud Run user who&amp;#x27;d run into the same problem from a different angle. They were doing background processing triggered by Pub/Sub, and they&amp;#x27;d convinced themselves it couldn&amp;#x27;t be a CPU allocation issue, because the docs say CPU is allocated &amp;quot;during request processing time&amp;quot; and they had a request in flight the whole time the processing was happening. What they&amp;#x27;d missed was that they were acknowledging the Pub/Sub message early to avoid redelivery, and continuing the actual work in the background afterwards. The HTTP request had returned. Cloud Run considered the instance idle. The work that the user thought of as &amp;quot;request processing&amp;quot; was, from the platform&amp;#x27;s perspective, background work, and it was being throttled accordingly. The phrase &amp;quot;request processing time&amp;quot; is doing a lot of work in those docs, and it doesn&amp;#x27;t mean what most people reading it would assume.&lt;/p&gt;&lt;p&gt;The third reason is subtle and unexpected: Go handled this absurdly under-resourced situation with grace. Our service was being asked to do real work on roughly 5% of a vCPU, and it kept working. Goroutines parked, the scheduler kept multiplexing, the runtime kept making forward progress. Queries finished eventually. Background jobs completed eventually. Nothing crashed, nothing OOM&amp;#x27;d, nothing fell over. The system was, in a meaningful sense, &lt;em&gt;operating correctly&lt;/em&gt; the whole time. It was just slow.&lt;/p&gt;&lt;p&gt;That&amp;#x27;s a real strength of the Go runtime, and we&amp;#x27;re glad we wrote the backend in Go. 
But it also presented us with a steady state of &amp;quot;everything still works, it&amp;#x27;s just bad,&amp;quot; which is one of the hardest failure modes to debug, because it doesn&amp;#x27;t trigger any of the signals operators normally rely on. The dashboards were green. The error rate was fine. The service was up. It was just inexplicably slow, in ways that weren’t visible to users, for months.&lt;/p&gt;&lt;p&gt;If there&amp;#x27;s a moral to this story, it&amp;#x27;s that observability is necessary but not sufficient. We could see &lt;em&gt;exactly&lt;/em&gt; what was happening (pool waits, slow queries, sluggish goroutines) and the picture was so consistent with &amp;quot;the database is overloaded&amp;quot; that we kept looking at the database. Better data doesn&amp;#x27;t fix that on its own. You also need to keep poking at your assumptions about the platform you&amp;#x27;re running on.&lt;/p&gt;&lt;p&gt;Clearly, though, the real moral of the story is to read the billing docs.&lt;/p&gt;</content:encoded><author>Johan Brandhorst-Satzkorn</author></item><item><title>How Oblique handles code review etiquette</title><link>https://oblique.security/blog/how-oblique-handles-code-review-etiquette/</link><guid isPermaLink="true">https://oblique.security/blog/how-oblique-handles-code-review-etiquette/</guid><description>Code reviews aren’t just about ensuring engineering quality, they’re also about building team culture. Focus on communication and understanding rather than nit-picking style quirks. </description><pubDate>Thu, 30 Apr 2026 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Code reviews can be a source of immense anxiety for engineering teams. Many of us have had bad experiences with nitpicking, bike-shedding, or that’s-not-how-I-would-have-done-it-itis. 
This is such a common pain point that there are &lt;a href=&quot;https://dsl.pubpub.org/pub/cranx-workbook/release/1&quot;&gt;entire research projects&lt;/a&gt; built around code review anxiety and how to reduce it.&lt;/p&gt;&lt;p&gt;The age of agentic coding brings a whole new layer to this dynamic: a fellow engineer disappearing into a cave and coming back with 12,000 lines of code you were expected to review that same day used to be a cautionary horror story; now that could be an average Tuesday with a poorly supervised agent. Do you let it by with a cursory LGTM? Do you throw another agent at it to do the review for you? Do you even still have coworkers?&lt;/p&gt;&lt;p&gt;It doesn’t have to be like this.&lt;/p&gt;&lt;p&gt;The fun thing about building a startup is you also get to build an engineering team from scratch, with your own engineering culture: one that hopefully avoids the worst of the friction from past jobs. We’re small enough now that we can talk as a group about how things should be done, but we won’t stay this size forever. Given how quickly norms are changing across the industry, we wanted to take some time to write down how we navigate code reviews at Oblique so that we’re all on the same page about how we want to collaborate as a team.&lt;/p&gt;&lt;p&gt;Here’s where our internal guidance landed.&lt;/p&gt;&lt;h2&gt;Know what the goal is&lt;/h2&gt;&lt;p&gt;The explicit goal of code review is to ensure high-quality engineering. The implicit goal, which is just as important, is to ensure high-quality &lt;strong&gt;communication&lt;/strong&gt; among team members. Even on a small team it is easy to lose track of how our product and codebase are changing, and code reviews are a crucial tool for sharing context and thinking.&lt;/p&gt;&lt;p&gt;That means both authors and reviewers should treat code reviews primarily as an act of collaborative learning. Do not send slop PRs, which are disrespectful of the reviewer’s time. 
Do not send slop reviews, which are disrespectful of the author’s effort.&lt;/p&gt;&lt;p&gt;PRs are not a good venue for debating architecture, and should not replace engineering discussions and design documents. Good communication starts long before the code gets written. If you as a reviewer are surprised by the contents of a PR, that’s a sign that there’s a gap in the team’s communication somewhere. If a reviewer has concerns about the fundamental approach of a change, that should be expressed as a top-level comment or, better yet, through a conversation with the author.&lt;/p&gt;&lt;h2&gt;Optimize for the reviewer&lt;/h2&gt;&lt;p&gt;We all have different styles of working, but at the end of the day, someone else has to be able to read and understand your code. Oftentimes that someone else is you, six months (or days) from now.&lt;/p&gt;&lt;p&gt;Consider how diffs read and attempt to minimize them. Before sending a PR for review, authors should read through their own change in the GitHub UI. You’d be amazed at what mistakes you’ll suddenly spot when you’re looking at your own code in a different context. Reduce changes that aren’t relevant to the core logic, ensure tests have clear names, comment complicated blocks, and follow the existing style of the surrounding code.&lt;/p&gt;&lt;p&gt;PR size is more of an art than a science. Keep PRs self-contained, brief, and reviewable. Large PRs can become functionally impossible to review, while extremely small ones lose the context of the change. Always attempt to compose the smallest PR that stands on its own. Wherever possible, work iteratively and merge code behind feature flags instead of developing out-of-tree.&lt;/p&gt;&lt;h2&gt;Author responsibilities&lt;/h2&gt;&lt;p&gt;We trust one another to use the best tools for the job, and we also expect that authors are responsible for all code they send. 
This means that authors should read, review, and &lt;strong&gt;understand&lt;/strong&gt; any changes generated by coding agents. Authors are not required to disclose their use of AI tools, and reviewers should have the same standards regardless of how a change was created.&lt;/p&gt;&lt;p&gt;What &lt;em&gt;is&lt;/em&gt; important to disclose is the maturity and intent of your change. If you threw a prototype together just to explore a concept and you don’t care if it ever gets merged, regardless of whether that prototype was handcrafted by you or by Claude, that should be expressed clearly in the PR description so that reviewers know what kind of feedback to provide.&lt;/p&gt;&lt;p&gt;Agentic tools have a tendency to produce functions with long names and flowery multi-line comments. These often have little substance despite their verbosity. Reviewers will always appreciate a hand written test name, or a few comments on the key lines over a spaghetti of helpers.&lt;/p&gt;&lt;p&gt;Every commit to main must work. This means if a backend change introduces something that breaks the frontend or vice-versa, it must be feature-flagged or the exception must be handled. Authors alone bear the risks for large branches and conflicts that occur because of them. Moving fast is not an excuse for breaking things.&lt;/p&gt;&lt;h2&gt;Reviewer responsibilities&lt;/h2&gt;&lt;p&gt;Prioritize code reviews over writing code. A reviewer that lets their queue get out of hand blocks the whole team. Each reviewer is expected to set up their notification workflow such that they can provide timely responses. (At Oblique, we rely heavily on the Slack and Linear integrations for GitHub.)&lt;/p&gt;&lt;p&gt;Trust your teammates to take your feedback seriously. Where possible, approve the PR with comments to avoid a second round-trip.&lt;/p&gt;&lt;p&gt;Ask for help when you need it. If a PR feels too unwieldy to understand, it’s okay (encouraged even!) 
to ask for it to be broken up into smaller pieces. Everyone benefits from a better shared understanding of the logic. Alternatively, don’t underestimate the power of asking to walk through a review together.&lt;/p&gt;&lt;p&gt;It’s also okay to ask an agent to help you digest the PR, as long as you remember that the end goal is for &lt;em&gt;you&lt;/em&gt; to understand the PR. If you’re just copying and pasting an agent’s review output into GitHub, you’re not exactly adding anything the author couldn’t have done themselves. &lt;/p&gt;&lt;p&gt;Reviews should focus on APIs and the tests for those APIs. Don’t overindex on style or internal details, unless those have a noticeable impact on performance, security, or readability. Critiques should be constructive and substantive - there are many different ways of accomplishing the same thing, and someone else choosing a different path than you does not necessarily mean they’re wrong. If there is a substantive reason to align on a specific style, add a linter or update the relevant &lt;code&gt;AGENTS.md&lt;/code&gt; files.&lt;/p&gt;&lt;p&gt;Always remember that your teammate is on the receiving end of the review. A good reviewer takes a moment or two to digest the change as a whole before providing feedback. Avoid knee-jerk reactions and brigading, and consider the tone of your comments. When in doubt, discuss face-to-face.&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;Our team is still small, so these expectations seem obvious to all of us now. The good thing about writing this down is that when we fall short of our ideals, we have documented guidance that we can revisit to course-correct. It also means we have a historical snapshot of our thinking, which makes it easier to see if our old practices are no longer serving us. Software engineering has always been a dynamic practice, and these days it’s changing faster than ever. 
Now we have a baseline.&lt;/p&gt;</content:encoded><author>Jenny Zhang</author></item><item><title>The $0 security stack</title><link>https://oblique.security/blog/security-stack/</link><guid isPermaLink="true">https://oblique.security/blog/security-stack/</guid><description>As part of our initial SOC2 audit, we wanted to put in place actually useful security tools. Luckily, in 2026, world-class security costs literally nothing.</description><pubDate>Mon, 06 Apr 2026 17:28:18 GMT</pubDate><content:encoded>&lt;p&gt;In building a tool that determines what access is allowed in your environment and which integrates with your most sensitive internal systems (your identity provider and your HRIS), we have to care about security. Not just because we like it, and not just for compliance, though compliance is often the motivation to get started with security.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;For many SaaS tools, the security features you need (like SSO, SCIM, or audit logs) are often on the highest price tier, because they’re &amp;quot;enterprise&amp;quot; features. You need to pay for security features — but in recent years, there’s been a different approach with security vendors. Between open source and &lt;a href=&quot;https://ventureinsecurity.net/p/product-led-growth-in-cybersecurity&quot;&gt;product-led growth&lt;/a&gt;, security doesn’t have to cost a lot. In fact, for a team like ours, it costs nothing.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We just completed our first SOC2 audit period, and as part of that, we set up our initial security stack. We’ve set up &lt;em&gt;literally world-class&lt;/em&gt; security tools, without it costing us a cent.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Here’s our $0 security stack.&lt;/p&gt;&lt;h3&gt;Semgrep for SAST and SCA&lt;/h3&gt;&lt;p&gt;We use &lt;a href=&quot;https://semgrep.dev/&quot;&gt;Semgrep&lt;/a&gt; for code analysis and supply chain analysis (CC8.1). It’s free for up to 10 contributors, so it’s free for us. 
It’s made some valuable suggestions, such as forcing a higher minimum version of TLS or using a different library. The AI-based triage means that issues are automatically triaged as false positives based on context in the codebase, or dependencies that aren’t reachable aren’t prioritized.&lt;/p&gt;&lt;h3&gt;TruffleHog for secret scanning&lt;/h3&gt;&lt;p&gt;We use &lt;a href=&quot;https://trufflesecurity.com/trufflehog&quot;&gt;TruffleHog&lt;/a&gt; on every commit to make sure we don’t accidentally leak secrets. We initially ran the open source Action on every commit, but now we run it daily to verify we haven’t regressed. It hasn’t caught anything, which is a good thing!&lt;/p&gt;&lt;h3&gt;RunReveal for SIEM&lt;/h3&gt;&lt;p&gt;We use &lt;a href=&quot;https://runreveal.com/&quot;&gt;RunReveal&lt;/a&gt; as our SIEM, which ingests logs from our infrastructure (GCP, Cloudflare, and GitHub). Once we set up log streaming, we turned on the built-in detections (CC7.2), and hooked it up as an alert to a private #siem Slack channel. RunReveal’s Community tier supports 5 data sources and generous retention. It found some misconfigurations we had in the first 24h, but since then the only time it’s gone off was over the holidays when it thought our log streaming was broken since no one was deploying 😂.&lt;/p&gt;&lt;h3&gt;Sublime Security for email security&lt;/h3&gt;&lt;p&gt;We connected &lt;a href=&quot;https://sublime.security/&quot;&gt;Sublime Security&lt;/a&gt; to our Google Workspace, and have it send alerts to the #siem Slack channel as well. Their Core tier is free for up to 100 mailboxes.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We assume that phishing attempts (that we absolutely get sent) will occasionally succeed, and so rely on SSO and/or MFA across all systems to limit the damage of a single click, but also use email security tools to flag and mitigate suspicious emails as an additional defense. 
I strongly prefer this to doing phishing simulations (which might get brought up in CC2.2), and &lt;a href=&quot;https://people.cs.uchicago.edu/~grantho/papers/oakland2025_phishing-training.pdf&quot;&gt;which we know don’t work&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Sublime detects more than Google out of the box, but we still get a lot more false positives than ideal, so we are now testing out another solution in parallel.&lt;/p&gt;&lt;h3&gt;Apple Business for MDM (now free)&lt;/h3&gt;&lt;p&gt;Unfortunately, I lied, there is one security tool we &lt;del&gt;pay&lt;/del&gt; paid for: an MDM. (Everyone else, please don’t charge me!). It may seem crazy to have an MDM at a company our size, but… try deploying one later and you’ll regret it. (And I refuse to deploy a compliance vendor’s &lt;a href=&quot;https://osquery.io/&quot;&gt;osquery&lt;/a&gt; shim with no enforcement. If that’s the solution then we could just deploy osquery ourselves… which also does not spark joy.)&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We use &lt;a href=&quot;https://www.apple.com/newsroom/2026/03/introducing-apple-business-a-new-all-in-one-platform-for-businesses-of-all-sizes/&quot;&gt;Apple Business&lt;/a&gt; (formerly Apple Business Essentials) to enforce disk encryption (CC6.7), password lock (CC6.1), force updates (CC7.1) — and share WiFi creds. (Note that Mac devices have &lt;a href=&quot;https://support.apple.com/guide/security/protecting-against-malware-sec469d47bd8/web&quot;&gt;XProtect&lt;/a&gt; anti-malware (CC6.8), and it can’t be disabled.) It was already less than the cost of Notion… and now it’s free.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As a founder, I get it — every dollar matters. But if you&amp;#x27;re already paying for compliance, the incremental cost of doing security &lt;em&gt;correctly&lt;/em&gt; is smaller than you think. &amp;quot;We can&amp;#x27;t afford security&amp;quot; is almost never true in 2026, even if you’re a startup. 
There&amp;#x27;s more excellent, free tooling than you realize. You just have to look.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;(Also, it’s not lost on me that we don’t have a free tier of Oblique yet for access requests and access reviews (CC6.2, CC6.3). Give us a bit more time.)&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Passkey PRFs for end-to-end encryption</title><link>https://oblique.security/blog/passkey-prf/</link><guid isPermaLink="true">https://oblique.security/blog/passkey-prf/</guid><description>The passkey PRF extension lets syncable credentials do much more than login users. See how apps are using this for end-to-end encryption.</description><pubDate>Wed, 25 Feb 2026 17:23:00 GMT</pubDate><content:encoded>&lt;p&gt;End-to-end encrypted apps have always had to reckon with users losing their encryption keys. If you brick your phone, you just assume you&amp;#x27;re never getting your Signal chats back. But as E2EE has found its way into consumer services, there are more seamless backup and syncs of encrypted data between devices and even into clients like web browsers.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The problem is humans are still bad at picking strong passwords, or memorizing 256 bits of random data.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This coming weekend, I’ll be speaking at &lt;a href=&quot;https://www.bsidesseattle.com/&quot;&gt;BSidesSeattle&lt;/a&gt; on how WhatsApp, Signal, and X (Twitter) leverage hardware security modules to encrypt data with a relatively weak passphrase or PIN (Saturday at 3:30pm in Track 4!). What didn&amp;#x27;t make the cut for that talk is the next generation of schemes of deriving encryption keys from passkeys.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Passkeys are a newish technology based on the security key protocols in browsers and mobile apps. 
As opposed to a physical hardware token, they are magically synced by your OS credential store or password manager (iCloud Keychain, Google Password Manager, Windows Hello, 1Password, etc.), and replace your password rather than act as a second factor.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As passkey syncing has dramatically improved over the last few years, services like &lt;a href=&quot;https://1password.com/blog/encrypt-data-saved-passkeys&quot;&gt;1Password&lt;/a&gt;, &lt;a href=&quot;https://bitwarden.com/blog/prf-webauthn-and-its-role-in-passkeys/&quot;&gt;Bitwarden&lt;/a&gt;, and &lt;a href=&quot;https://confer.to/blog/2025/12/passkey-encryption/&quot;&gt;Confer&lt;/a&gt; are leveraging an extension for deriving cryptographic material from passkeys. Today, if you enable WhatsApp encrypted chat backups, by default your backups are protected with a passkey, instead of a password that you have to memorize:&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/f2e157454a712951cac524d1a531a4bc906d936d-350x720.heif?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;This post covers passkey pseudo-random functions (PRFs), and their use in end-to-end encryption.&lt;/p&gt;&lt;h2&gt;Passkey PRFs&lt;/h2&gt;&lt;p&gt;Passkeys are extremely limited in the primitives they expose. Through the browser, you can only request a signature over a structured challenge. There&amp;#x27;s no API for signing arbitrary data or decryption, much less key derivation.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Using an &lt;a href=&quot;https://github.com/w3c/webauthn/wiki/Explainer:-PRF-extension&quot;&gt;underlying API&lt;/a&gt; originally intended for disk decryption, WebAuthn supports a &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Web_Authentication_API/WebAuthn_extensions#prf&quot;&gt;“prf” extension&lt;/a&gt; where the client combines a seed (called a “salt”) with a per-credential private value. 
This produces a random-looking but deterministic result that can be used to derive other cryptographic material like encryption keys.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;To enable this extension, pass a salt to the “extensions.prf.eval.first” argument for a registration or authentication request:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Register a passkey with the server and generate a PRF output.
const cred = await navigator.credentials.create({
  publicKey: {
    challenge, // Provided by the server.
    rp: {
      id: rpId,
      name: rpName,
    },
    user: {
      id: userId, // Provided by the server.
      displayName: username,
      name: username,
    },
    pubKeyCredParams: [
      { type: &quot;public-key&quot;, alg: -7 },
      { type: &quot;public-key&quot;, alg: -257 },
    ],
    extensions: {
      prf: {
        eval: {
          first: salt, // Provide a salt for a deterministic output
        },
      },
    },
  },
});&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The salt can be chosen based on the needs of an application. A static value is fine since the PRF is unique per-credential, so two users (or the same user on different sites) will always produce different outputs. Per-user salts can be useful for supporting rotation of key material for more advanced needs.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;One benefit of a static salt is that it can be provided during the login challenge, before you know which user is authenticating:&lt;br/&gt;&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Use a static salt value.
const salt = new TextEncoder().encode(&quot;my-sites-static-salt&quot;).buffer;

// Log in the user and generate a PRF output all in one go.
const cred = await navigator.credentials.get({
  publicKey: {
    challenge, // Provided by the server.
    rpId,
    extensions: {
      prf: {
        eval: {
          first: salt,
        },
      },
    },
  },
});&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;A second salt can also be passed to the extension as a credential rotation primitive:&lt;br/&gt;&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// From the Mozilla docs
//
// https://developer.mozilla.org/en-US/docs/Web/API/Web_Authentication_API/WebAuthn_extensions#prf
({
  extensions: {
    prf: {
      eval: {
        first: currentSessionKey, // salt for current session
        second: nextSessionKey, // salt for next session
      },
    },
  },
});
&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;The PRF output&lt;/h2&gt;&lt;p&gt;The credential returned by the registration or authentication phase will contain a PRF result field.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const pubKey = cred as PublicKeyCredential;
const resp = pubKey.response as AuthenticatorAssertionResponse;
const ext = pubKey.getClientExtensionResults();
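// Sketch (an assumption, not part of the original flow): authenticators
// without PRF support return no results, so fail fast before using them.
if (!ext.prf?.results?.first) {
  throw new Error(&quot;PRF output unavailable&quot;); // Illustrative message.
}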
// Pseudo-random value
const { first } = ext.prf.results;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The client can then use the output to derive key material as input to an &lt;a href=&quot;https://en.wikipedia.org/wiki/Hybrid_cryptosystem#Envelope_encryption&quot;&gt;envelope encryption&lt;/a&gt; scheme or &lt;a href=&quot;https://en.wikipedia.org/wiki/Double_Ratchet_Algorithm&quot;&gt;Double Ratchet&lt;/a&gt; algorithm. In a more straightforward case, the client can feed the result into the Web Crypto API to encrypt data directly:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const prfKey = await crypto.subtle.importKey(
  &quot;raw&quot;,
  ext.prf.results.first,
  &quot;HKDF&quot;,
  false, // Not extractable
  [&quot;deriveKey&quot;],
);

// Derive an encryption key with a standard key derivation function.

// Salt for the HKDF. This is NOT the salt passed to the passkey PRF.
const hkdfSalt = crypto.getRandomValues(new Uint8Array(12));
const secretKey = await crypto.subtle.deriveKey(
  {
    name: &quot;HKDF&quot;,
    hash: &quot;SHA-256&quot;,
    salt: hkdfSalt.buffer,
    info: new TextEncoder().encode(&quot;note-encryption-key&quot;),
  },
  prfKey,
  { name: &quot;AES-GCM&quot;, length: 256 },
  false,
  [&quot;encrypt&quot;, &quot;decrypt&quot;],
);

// An &quot;initialization vector&quot; is a public value that MUST be unique per
// encryption event with a given symmetric AES key. Best practice is to
// generate it randomly for every call to &quot;encrypt&quot;.
const iv = crypto.getRandomValues(new Uint8Array(16));
// Encrypt with AES-GCM.
const ciphertext = await crypto.subtle.encrypt(
  { name: &quot;AES-GCM&quot;, iv },
  secretKey,
  new TextEncoder().encode(data), // Data to encrypt.
);

// Send encrypted data to the server to store as well as public metadata
// (IV and HKDF salt).
const encryptedData = new Uint8Array([
  ...iv,
  ...hkdfSalt,
  ...new Uint8Array(ciphertext)]).buffer;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Later, the client can decrypt the encrypted data held by the server using the same PRF output:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Receive the &quot;encryptedData&quot; back from the server.

// Extract the IV, HKDF salt, and ciphertext.
const iv = encryptedData.slice(0, 16);
const hkdfSalt = encryptedData.slice(16, 16 + 12);
const ciphertext = encryptedData.slice(16 + 12);

const secretKey = await crypto.subtle.deriveKey(
  {
    name: &quot;HKDF&quot;,
    hash: &quot;SHA-256&quot;,
    salt: hkdfSalt, // Pass the slice directly; an ArrayBuffer has no .buffer.
    info: new TextEncoder().encode(&quot;note-encryption-key&quot;),
  },
  prfKey, // HKDF key imported from the PRF output.
  { name: &quot;AES-GCM&quot;, length: 256 },
  false,
  [&quot;encrypt&quot;, &quot;decrypt&quot;],
);

const plaintext = await crypto.subtle.decrypt(
  { name: &quot;AES-GCM&quot;, iv },
  secretKey,
  ciphertext,
);&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Putting it together&lt;/h2&gt;&lt;p&gt;As part of this post, we open-sourced a full demo with a React frontend and Go backend to run locally. You can find the code on GitHub:&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://github.com/oblique-security/webauthn-prf-demo&quot;&gt;https://github.com/oblique-security/webauthn-prf-demo&lt;/a&gt;&lt;/p&gt;&lt;div style=&quot;display:none&quot;&gt;Unknown block type &quot;videoEntry&quot;, specify a component for it in the `components.types` option&lt;/div&gt;&lt;p&gt;Passkey PRFs are an incredibly interesting primitive for cryptographic operations on a mobile app, or in a browser. This isn’t just limited to symmetric encryption. You could imagine seeding asymmetric key generation for SSH connections, or signing application prompts for audit logs. If you’re willing to put up with a little bit of JavaScript cryptography, there’s now a robust means to drive these schemes with key material that magically syncs between devices.&lt;/p&gt;</content:encoded><author>Eric Chiang</author></item><item><title>Fit access controls to your org, not the other way around</title><link>https://oblique.security/blog/org-structure/</link><guid isPermaLink="true">https://oblique.security/blog/org-structure/</guid><description>Groups used for access controls can be based on department, reporting chain, or projects. The right answer is whatever maps best to how your org actually works.</description><pubDate>Thu, 19 Feb 2026 21:20:13 GMT</pubDate><content:encoded>&lt;p&gt;Turns out, you mostly &lt;a href=&quot;https://hardcoresoftware.learningbyshipping.com/p/047-dont-ship-the-org-chart&quot;&gt;ship your org chart&lt;/a&gt;. 
That&amp;#x27;s &lt;a href=&quot;https://en.wikipedia.org/wiki/Conway%27s_law&quot;&gt;Conway&amp;#x27;s Law&lt;/a&gt;, and it applies to access controls just as much as it applies to software architecture, for better or worse.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;blockquote&gt;Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations. —  Melvin E. Conway&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Our job at Oblique isn&amp;#x27;t to tell you how your organization should be set up. It&amp;#x27;s to give you the tools to work with how it already is.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We’ve talked with a lot of organizations about access management. Although they all want to express access controls slightly differently, they’re all more similar than they are different. Your company may be special, but your org chart really isn’t all that special. For example, many organizations have a concept of a small ad hoc team working together on the same project that doesn’t necessarily map to where they sit in the org chart, whether they call it a “squad”, “pod”, or something else. The same concepts keep cropping up across organizations, although the specific implementations might differ.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Organizations mostly use one of three ways to group users together to decide who should be granted access to something:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;By department or function. For example, everyone in sales needs access to Salesforce. This is closest in spirit to &lt;a href=&quot;https://oblique.security/resources/access-control-models/&quot;&gt;role-based access control&lt;/a&gt;, but ironically it is often implemented using attributes, with an attribute of department or title. 
(In Oblique, these are &lt;a href=&quot;https://docs.oblique.security/start/concepts/#attribute-based-groups&quot;&gt;attribute-based groups&lt;/a&gt;.)&lt;/li&gt;&lt;li&gt;By manager or reporting chain. For example, everyone who reports to Samantha needs access to Zendesk, because Samantha runs customer success, which could include multiple departments like support, sales engineering, and DevRel. This is distinct from an attribute — it’s strictly about org structure. If Samantha gets promoted, the access controls follow whoever takes over her old job. (In Oblique, these are &lt;a href=&quot;https://docs.oblique.security/start/concepts/#reporting-groups&quot;&gt;reporting groups&lt;/a&gt;.)&lt;/li&gt;&lt;li&gt;By project. For example, everyone working on the MCP server needs access to PostHog, even though that project has engineering, as well as product, design, docs, marketing, and SRE. Often, this is tied to a specific resource rather than an application, like access to a particular Snowflake table rather than the Snowflake app. It can also be tied to a communication channel or group. (In Oblique, these are &lt;a href=&quot;https://docs.oblique.security/start/concepts/#team-groups&quot;&gt;team groups&lt;/a&gt;.)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;What an individual needs access to depends on a mix of all three group types. Although the specifics depend on the &lt;a href=&quot;https://newsletter.pragmaticengineer.com/i/138252015/drawing-the-famous-microsoft-oracle-amazon-facebook-google-apple-comic&quot;&gt;culture and org structure&lt;/a&gt;, almost every organization starts the same way: inherent access tied to employment status, so that all full-time employees get access to basic collaboration tools like Google Workspace and Slack. 
Additional access depends on what system and data is the source of truth for departments and teams — sometimes it’s the reporting chain, sometimes it’s a department or cost center attribute, and sometimes it’s a completely ad hoc, manually-managed team.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This complexity — of using multiple data sources and group types to manage access — is why access controls end up being so complicated. There are many stakeholders involved in org structure design, but IT is usually not one of them. But then IT is tasked with using an access control system that doesn’t fit reality, and so the result is a mess: groups based on data that isn’t maintained, and access that’s hard to trace and understand.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The goal with defining groups for access management is not only to make the initial setup sensible (so that you can reason about access), but also to make the system possible to maintain. Your decision depends on what data you have and where its source of truth lives. You need data that’s high fidelity, that someone is already maintaining, and that’s in a system that your access controls can integrate with, like your HRIS.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Your access control model needs to map to how your organization works, not the other way around. There is no single ‘right’ way to set up permissions using these groups. There is only the way that works for your org.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Access requests are a bandaid, not a fix</title><link>https://oblique.security/blog/access-bandaids/</link><guid isPermaLink="true">https://oblique.security/blog/access-bandaids/</guid><description>IT teams are overwhelmed with never-ending access requests. 
Getting off the identity treadmill means getting to fewer tickets over time, not faster tickets.</description><pubDate>Wed, 11 Feb 2026 18:53:19 GMT</pubDate><content:encoded>&lt;p&gt;For many IT and security teams, identity feels like a constant treadmill of access changes: add the new hire, remove that contractor, update access to the sales tool, and worst of all, make sure nothing breaks during next Monday’s reorg. To handle the increasing volume of these access changes, the identity industry has gotten very good at handling them quickly, by letting you request elevated privileges in a CLI, or approve access changes in Slack. But you’re still on a treadmill — we’re just making it go faster and faster. What if you want off the treadmill?&lt;/p&gt;&lt;h3&gt;The only real security improvements are solving classes of problems&lt;/h3&gt;&lt;p&gt;In many areas of security, we’ve learned that fixing classes of problems is the only way we can address them for good. Instead of finding (and fixing) an increasing number of insecure code findings, what if we could reduce the total volume of issues by moving to memory safe languages? Solving a broader class of problems is necessary to reduce the volume of issues.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;In vulnerability management, we already had automated patching, with tools like Dependabot or automated OS updates. But there is still an order of magnitude more impact by moving to a minimal base image. It’s not about applying patches faster, it’s about never getting them to begin with. Automated patching helped, but it wasn’t the real fix. Removing unnecessary dependencies was.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We haven’t had that mindset shift — and so that kind of impact — in access management. Access management is still the same: constantly updating and changing access as people’s roles and responsibilities change. 
It’s still human-intensive work.&lt;/p&gt;&lt;h3&gt;Access management is full of bandaids&lt;/h3&gt;&lt;p&gt;We’ve kept implementing bandaids to address the fact that the access we’ve granted doesn’t match what people actually need. It makes sense to regularly review access to verify that it’s still correct — because why would you have checked since you first provisioned it? This sanity check was so desperately necessary that quarterly access reviews have become entrenched as compliance requirements across multiple industries, even if many security teams treat them as a checkbox exercise.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The next bandaid was expiring access. If we have access automatically expire, then there’s less to clean up. That makes sense. But expiring access isn’t widely available — most identity providers and applications don’t natively support it — which means it’s not frequently used outside of the highest-risk situations, like prod access. Removing access automatically also means you need to think about how to reprovision it when users need it again.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Which brings us to the bandaid of Slack-based access. Rather than making a user file a ticket in Jira or ServiceNow, what if they ask for access where they already are, in Slack? This is a huge improvement in terms of user experience — but it leads to never-ending access request tickets, both for the IT team to respond to and for the user to file. The number of tickets has grown untenable. But faster approvals don’t fix the underlying issue. They just make living with it more tolerable.&lt;/p&gt;&lt;h3&gt;Access requests are a bandaid, not a fix&lt;/h3&gt;&lt;p&gt;With IT Ops, IT teams are moving from living in a ticket queue to building and automating operational systems. And as more and more IT teams report into the CISO, they’re adopting the security engineering mindset that’s emerged in the past few years. 
IT teams are using more automation and code than ever before — and they’re looking at their daily piles of access request tickets and asking why there’s not a better solution.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;You can only make approvals go so fast if your request flow itself is a bandaid, because you can’t express the policies you actually want. &lt;em&gt;You can’t automate a human&lt;/em&gt; — you’ll always be limited in how much you can automate if you explicitly decided to include a human in the loop.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The issue is both technology and process. &lt;a href=&quot;https://oblique.security/blog/policies-report/&quot;&gt;Rarely do you actually want a line people manager determining who gets access to prod&lt;/a&gt;, just like you don’t want them deciding what new project management tool you procure. But when IT and security teams require manager approval as part of an access request, it’s often as a proxy control, to ensure that an intern doesn’t access production, or a contractor doesn’t have visibility to customer data. If you can’t define the controls you actually want, you need to find alternative controls to meet the same requirements. We need better building blocks.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We need expiring access, so that a one-off production debug session doesn&amp;#x27;t turn into permanent prod access that no one remembers to revoke. We need auto-approvals for known, expected access changes — like access to the customer database related to debugging a specific issue, or as part of an incident — so that routine requests don&amp;#x27;t clog up the queue, while still keeping a record. (Access requests are still necessary, just not at the scale they&amp;#x27;re being used today.) And we really, really need deny policies: hard guardrails that prevent access that should never happen, without relying on someone to catch it in review. 
It&amp;#x27;s a lot easier to maintain a short list of what’s approved than to audit an ever-growing list of places you might have made a mistake.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Getting off the identity treadmill means getting to fewer tickets over time, not faster tickets. The only way to get there is to have your access controls and policies express what your organization actually wants, and the reality of where it is now, updated automatically over time as things change. That’s what we’re working on here at Oblique.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Go’s synctest is amazing</title><link>https://oblique.security/blog/go-synctest/</link><guid isPermaLink="true">https://oblique.security/blog/go-synctest/</guid><description>We threw Go’s new “testing/synctest” package at a particularly gnarly part of our codebase and were pleasantly surprised by how effective it was.</description><pubDate>Thu, 05 Feb 2026 16:59:28 GMT</pubDate><content:encoded>&lt;p&gt;We threw Go’s new “&lt;a href=&quot;https://pkg.go.dev/testing/synctest&quot;&gt;testing/synctest&lt;/a&gt;” package at a particularly gnarly part of our codebase and were pleasantly surprised by how effective it was. This post covers the synctest package, its nuances, and how it does much more than speed up your tests.&lt;/p&gt;&lt;p&gt;The main headline for Go 1.25’s synctest is its ability to magically advance time. Tests run in a “bubble” with a fake clock and calls to &lt;code&gt;time.Sleep&lt;/code&gt; are virtualized, causing them to seem to run instantly:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// This test runs instantly!
func TestSleep(t *testing.T) {
	synctest.Test(t, func(t *testing.T) {
		var t1, t2 time.Time
		go func() {
			time.Sleep(time.Second)
			t1 = time.Now()
		}()

		time.Sleep(time.Second * 2)
		t2 = time.Now()
		if t1.After(t2) {
			t.Errorf(&quot;Expected t2 (%s) to be after t1 (%s)&quot;, t2, t1)
		}
	})
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href=&quot;https://go.dev/play/p/nvp-Gc8WiL8&quot;&gt;&lt;em&gt;Run on the Go Playground&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;But even more than speed, synctest’s real advantage is being able to deterministically reason about the &lt;strong&gt;ordering&lt;/strong&gt; of events in a test. Let me explain.&lt;/p&gt;&lt;h2&gt;Background loops&lt;/h2&gt;&lt;p&gt;Our product has a large number of background routines. These can be as simple as deleting expired database rows, or more complex logic like leader election between instances.&lt;/p&gt;&lt;p&gt;These routines take the form of &lt;code&gt;Run&lt;/code&gt; functions that call another function in a loop (plus some backoff on failures). This is an abbreviated sample of what that might look like:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;type DB struct { /* */ }
func NewDB(ctx context.Context, path string) (*DB, error) { /* */ }
func (db *DB) Close() error { /* */ }
func (db *DB) CreateSession(ctx context.Context, id string, exp time.Time) error { /* */}
func (db *DB) GetSessionExpiry(ctx context.Context, id string) (time.Time, error) { /* */ }
func (db *DB) DeleteExpiredSessions(ctx context.Context, now time.Time) error { /* */ }

// Ummm... how do we test RunDeleteExpiredSessions?

func RunDeleteExpiredSessions(ctx context.Context, db *DB) {
	for {
		select {
		case &lt;-ctx.Done():
			return
		case &lt;-time.After(time.Minute):
			// Run every minute.
			if err := db.DeleteExpiredSessions(ctx, time.Now()); err != nil {
				log.Printf(&quot;Deleting expired sessions: %v&quot;, err)
			}
		}
	}
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;How do we write a test if we need to wait a minute for the logic to execute? This gets even more complicated with nested loops. Our leader election performs different sub-loops depending on if the current process is the leader or not. What do we do?&lt;/p&gt;&lt;h2&gt;synctest is about blocking&lt;/h2&gt;&lt;p&gt;synctest advances time when it considers all goroutines within the test “blocked”. There are a few rules, but this largely means channel operations, wait groups, and time methods.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://pkg.go.dev/testing/synctest#hdr-Blocking&quot;&gt;https://pkg.go.dev/testing/synctest#hdr-Blocking&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The following test demonstrates this by spawning three goroutines and blocking them on different conditions. The runtime sees that all goroutines are blocked, determines the smallest value passed to &lt;code&gt;time.Sleep()&lt;/code&gt;, and advances time exactly enough to unblock that call:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;func TestBlocking(t *testing.T) {
	synctest.Test(t, func(t *testing.T) {
		ch := make(chan struct{})
		events := []string{}
		wg := &amp;sync.WaitGroup{}
		wg.Add(1)
		
		doneCh := make(chan struct{})

		go func() {
			&lt;-ch // Blocked on channel read
			events = append(events, &quot;channel&quot;)
			close(doneCh)
		}()
		go func() {
			wg.Wait() // Blocked on wait group
			events = append(events, &quot;waitgroup&quot;)
			close(ch)
		}()
		go func() {
			time.Sleep(time.Second) // Blocked on time.Sleep
			events = append(events, &quot;sleep&quot;)
			wg.Done()
		}()

		&lt;-doneCh      // Also blocked on a channel read
		t.Log(events) // Logs [sleep waitgroup channel]
	})
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href=&quot;https://go.dev/play/p/qt0b4C15F4h&quot;&gt;&lt;em&gt;Run on the Go Playground&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;If there are multiple sleeps, each is unblocked in order until it blocks again or exits. This not only makes the test fast, but allows the &lt;strong&gt;use of time for synchronization&lt;/strong&gt; in a way that would be ill-advised otherwise.&lt;/p&gt;&lt;p&gt;For example, the following loop always increments the counter &lt;strong&gt;exactly&lt;/strong&gt; three times. In a “normal” program, this might occasionally only run twice based on when a goroutine happened to be scheduled.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;func TestLoop(t *testing.T) {
	synctest.Test(t, func(t *testing.T) {
		done := make(chan struct{})
		n := 0
		wg := &amp;sync.WaitGroup{}
		defer wg.Wait()
		wg.Go(func() {
			for {
				select {
				case &lt;-time.After(time.Minute): // Wait for a minute
					n++
				case &lt;-done:
					return
				}
			}
		})

		// Wait for three minutes and a millisecond
		time.Sleep((time.Minute * 3) + time.Millisecond)

		t.Log(n) // The loop always runs exactly three times and this prints &quot;3&quot;
		close(done)
	})
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href=&quot;https://go.dev/play/p/RWlfDmTI0Vl&quot;&gt;&lt;em&gt;Run on the Go Playground&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;Testing our database code&lt;/h2&gt;&lt;p&gt;Putting this together, we can cause &lt;code&gt;RunDeleteExpiredSessions&lt;/code&gt; to run its loop by sleeping longer than its call to &lt;code&gt;time.After&lt;/code&gt;. Since synctest then waits for the loop to block again, we can also be sure that the call to &lt;code&gt;DeleteExpiredSessions&lt;/code&gt; has completed when the test logic picks up again.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;func TestRunDeleteExpiredSessions(t *testing.T) {
	synctest.Test(t, func(t *testing.T) {
		// Create a database.
		ctx, cancel := context.WithCancel(t.Context())
		db, err := NewDB(ctx, filepath.Join(t.TempDir(), &quot;test.db&quot;))
		if err != nil {
			t.Fatalf(&quot;Creating test database: %v&quot;, err)
		}
		defer db.Close()
		
		// Start the deletion process in an external goroutine.
		wg := sync.WaitGroup{}
		defer wg.Wait()
		wg.Go(func() {
			RunDeleteExpiredSessions(ctx, db)
		})
		defer cancel() // Cause RunDeleteExpiredSessions to exit

		// Create a session that&apos;s valid for 30 seconds.
		exp := time.Now().Add(time.Second*30)
		if err := db.CreateSession(ctx, &quot;test-session&quot;, exp); err != nil {
			t.Fatalf(&quot;Creating session: %v&quot;, err)
		}

		// RunDeleteExpiredSessions runs every 1 minute. Block until the loop
		// runs once.
		time.Sleep(time.Minute+time.Second)
		
		// Verify that the garbage collector deleted the session.
		if _, err := db.GetSessionExpiry(ctx, &quot;test-session&quot;); !errors.Is(err, sql.ErrNoRows) {
			t.Errorf(&quot;GetSessionExpiry returned unexpected error on expired session: got=%s, want=%s&quot;,
				err, sql.ErrNoRows)
		}
	})
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And this runs in a fraction of a second:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;% go test -v 
=== RUN   TestRunDeleteExpiredSessions
--- PASS: TestRunDeleteExpiredSessions (0.01s)
PASS
ok  	example	0.340s&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We used synctest with our system’s core leader election logic that spawns a dozen of these loops, and it ran perfectly without any changes to the code. We were able to test leader election transitions where a new instance takes over from the previous one, and pick arbitrary points in the process to stop and verify the state of the database.&lt;/p&gt;&lt;p&gt;There are definitely nuances to synctest, such as knowing what does or doesn&amp;#x27;t block, or ensuring that all the goroutines are cleaned up properly. You can read more about this on &lt;a href=&quot;https://go.dev/blog/synctest&quot;&gt;Go’s Blog&lt;/a&gt;, or the &lt;a href=&quot;https://pkg.go.dev/testing/synctest&quot;&gt;package docs&lt;/a&gt;.&lt;/p&gt;</content:encoded><author>Eric Chiang</author></item><item><title>Type safe frontend APIs</title><link>https://oblique.security/blog/type-safe-frontend-apis/</link><guid isPermaLink="true">https://oblique.security/blog/type-safe-frontend-apis/</guid><description>An introduction to using Protobuf and Connect for type-safe API calls from the frontend to the backend.</description><pubDate>Thu, 29 Jan 2026 20:24:48 GMT</pubDate><content:encoded>&lt;p&gt;The benefits of schema-driven development have recently become more widely known in the backend. Cross-language RPCs, performance enhancements, compatibility guarantees, and a single source of truth for the API definitions are only some of the benefits. While &lt;a href=&quot;https://protobuf.dev/&quot;&gt;Protobuf&lt;/a&gt; and &lt;a href=&quot;https://grpc.io/&quot;&gt;gRPC&lt;/a&gt; provide the means for service-to-service communications, it remains difficult to accomplish this for frontend to backend communications. gRPC itself cannot be used in the browser, because of &lt;a href=&quot;https://groups.google.com/g/grpc-io/c/8t4lXbgEPOU?pli=1&quot;&gt;its reliance on the obscure Trailer HTTP feature&lt;/a&gt;. 
&lt;a href=&quot;https://www.openapis.org/&quot;&gt;OpenAPI&lt;/a&gt; has been trying to provide an IDL for RESTful services, but support is fragmented (OpenAPIv2 vs OpenAPIv3), the quality of language implementations varies widely, and the spec is incredibly broad.&lt;/p&gt;&lt;p&gt;A number of projects have popped up that aim to bridge the gap to the frontend using Protobuf as the IDL, such as:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://grpc.io/blog/state-of-grpc-web/&quot;&gt;gRPC-Web&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://github.com/grpc-ecosystem/grpc-gateway&quot;&gt;gRPC-Gateway&lt;/a&gt; (which I have maintained since 2018)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://github.com/twitchtv/twirp&quot;&gt;Twirp&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://connectrpc.com/&quot;&gt;Connect&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Many more&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Out of all of them, the most interesting one is Connect. Connect was created by &lt;a href=&quot;https://buf.build/&quot;&gt;Buf&lt;/a&gt;, but was donated to the CNCF and &lt;a href=&quot;https://www.cncf.io/projects/connect-rpc/&quot;&gt;now lives&lt;/a&gt; alongside &lt;a href=&quot;https://www.cncf.io/projects/grpc/&quot;&gt;gRPC&lt;/a&gt; as a community supported open-source project, making it more compelling to companies and independent projects alike. 
Connect is built on the ES2017 standard and so is compatible with all modern frameworks, with examples in major frameworks such as &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/react&quot;&gt;React&lt;/a&gt;, &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/nextjs&quot;&gt;Next.js&lt;/a&gt;, &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/vue&quot;&gt;Vue&lt;/a&gt;, &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/svelte&quot;&gt;Svelte&lt;/a&gt;, &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/angular&quot;&gt;Angular&lt;/a&gt;, &lt;a href=&quot;https://github.com/connectrpc/examples-es/tree/main/astro&quot;&gt;Astro&lt;/a&gt; and &lt;a href=&quot;https://github.com/connectrpc/examples-es&quot;&gt;many more&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Connect also &lt;a href=&quot;https://connectrpc.com/docs/web/query/getting-started&quot;&gt;integrates nicely&lt;/a&gt; with Tanstack Query, a popular framework-agnostic state management solution. At Oblique, we use the Connect API generators with our React frontend and a Connect-compatible backend using &lt;a href=&quot;https://github.com/connectrpc/vanguard-go&quot;&gt;Vanguard&lt;/a&gt;, another Buf project. Vanguard isn&amp;#x27;t necessary, and isn&amp;#x27;t part of this post. Vanguard also allows us to expose a REST-like JSON API, but that will be a topic of another blog post.&lt;/p&gt;&lt;h3&gt;The backend&lt;/h3&gt;&lt;p&gt;What does it look like to use? As with any Protobuf API, it starts with a .proto file. For this example, we’ll assume a Go backend, which is what we use at Oblique. Let’s define a basic service:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;edition = &quot;2023&quot;;

package mycorp.mybackend.v1;

// Important for Go generation, not necessary in general
option go_package = &quot;github.com/mycorp/myproject/gen/mycorp/mybackend/v1;mybackendv1&quot;;

service BackendService {
	rpc GetCurrentUser(GetCurrentUserRequest) returns (GetCurrentUserResponse);
}

message GetCurrentUserRequest {}

message GetCurrentUserResponse {
	string user_id = 1;
	string full_name = 2;
	string email = 3;
	string avatar_url = 4;
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;An RPC to get information about the currently logged-in user is a very common part of any frontend API. The definition above shows what it might look like when expressed as a Protobuf RPC.&lt;/p&gt;&lt;p&gt;Let’s save this file as &lt;code&gt;mycorp/mybackend/v1/backend.proto&lt;/code&gt;. Note how the Protobuf &lt;code&gt;package&lt;/code&gt; matches the folder structure we picked. We will use the &lt;a href=&quot;https://github.com/bufbuild/buf&quot;&gt;&lt;code&gt;buf&lt;/code&gt;&lt;/a&gt; Protobuf compiler to generate our client and server libraries. Install &lt;code&gt;buf&lt;/code&gt; using the method of your choice. We start by creating a &lt;code&gt;buf.yaml&lt;/code&gt; file to define lint rules and other configuration options:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;version: v2
lint:
  use:
    - STANDARD
breaking:
  use:
    - FILE&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We can now try running &lt;code&gt;buf build&lt;/code&gt; to see that everything is set up correctly:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ buf build&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If nothing happens, everything is working! Running &lt;code&gt;buf build&lt;/code&gt; gathers all of our Protobuf files and “compiles” them, ensuring there are no missing imports or syntax errors. To generate client and server libraries, we create a new file: &lt;code&gt;buf.gen.yaml&lt;/code&gt;, and define and install the Go &lt;a href=&quot;https://buf.build/docs/generate/tutorial/#generate-code-using-local-plugins&quot;&gt;plugins&lt;/a&gt; we want to use:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;version: v2
plugins:
  - local: protoc-gen-go
    out: gen
    opt:
      - paths=source_relative
  - local: protoc-gen-connect-go
    out: gen
    opt:
      - paths=source_relative&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We use the &lt;code&gt;paths=source_relative&lt;/code&gt; option to generate the Go files in the same folder structure as our Protobuf files use. We can now try running &lt;code&gt;buf generate&lt;/code&gt; to see that everything is working:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
$ go install connectrpc.com/connect/cmd/protoc-gen-connect-go@latest
$ buf generate&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We should see two new files created:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;gen/mycorp/mybackend/v1/backend.pb.go
gen/mycorp/mybackend/v1/mybackendv1connect/backend.connect.go&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;They’re the generated Go files containing the type and service definitions for the services and messages in our Protobuf package. At this point, we will need to make sure we have a &lt;code&gt;go.mod&lt;/code&gt; file in place. Using the module name we foreshadowed in our &lt;code&gt;go_package&lt;/code&gt; option above, run:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ go mod init github.com/mycorp/myproject
$ go mod tidy&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now we are ready to implement the backend logic. Create a &lt;code&gt;main.go&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;package main

import (
	&quot;context&quot;
	&quot;log&quot;
	&quot;net/http&quot;

	&quot;connectrpc.com/connect&quot;
	mybackendv1 &quot;github.com/mycorp/myproject/gen/mycorp/mybackend/v1&quot;
	&quot;github.com/mycorp/myproject/gen/mycorp/mybackend/v1/mybackendv1connect&quot;
	&quot;google.golang.org/protobuf/proto&quot;
)

type backendService struct {
}

func (s *backendService) GetCurrentUser(ctx context.Context, _ *connect.Request[mybackendv1.GetCurrentUserRequest]) (*connect.Response[mybackendv1.GetCurrentUserResponse], error) {
	return connect.NewResponse(&amp;mybackendv1.GetCurrentUserResponse{
		UserId:    proto.String(&quot;12345&quot;),
		FullName:  proto.String(&quot;John Doe&quot;),
		Email:     proto.String(&quot;john.doe@example.com&quot;),
		AvatarUrl: proto.String(&quot;https://example.com/avatar.jpg&quot;),
	}), nil
}

func main() {
	mux := http.NewServeMux()
	mux.Handle(mybackendv1connect.NewBackendServiceHandler(&amp;backendService{}))
	log.Println(&quot;Serving on http://localhost:8080&quot;)
	if err := http.ListenAndServe(&quot;localhost:8080&quot;, mux); err != nil &amp;&amp; err != http.ErrServerClosed {
		log.Fatalf(&quot;failed to start server: %v&quot;, err)
	}
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This is of course just a mockup of the real thing. In a real scenario we’d probably have the backend inspect a user’s cookie and look up the authenticated user in a database of some kind, or reject the request with an error if the cookie isn’t authenticated. The server can be run using:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ go run main.go
Serving on http://localhost:8080&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If we want to verify that the server is running, we can just use &lt;code&gt;curl&lt;/code&gt;, since Connect supports regular HTTP/1.1 and the JSON content-type:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ curl \
	-d &apos;{}&apos; \
	-H &apos;Content-Type: application/json&apos; \
	-X POST \
	http://localhost:8080/mycorp.mybackend.v1.BackendService/GetCurrentUser 
{&quot;userId&quot;:&quot;12345&quot;, &quot;fullName&quot;:&quot;John Doe&quot;, &quot;email&quot;:&quot;john.doe@example.com&quot;, &quot;avatarUrl&quot;:&quot;https://example.com/avatar.jpg&quot;}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Note how the method is &lt;em&gt;always&lt;/em&gt; POST, and the path is a combination of Protobuf package, service name and RPC name. Now we are ready to implement the frontend!&lt;/p&gt;&lt;h3&gt;The frontend&lt;/h3&gt;&lt;p&gt;As mentioned before, Connect has &lt;a href=&quot;https://github.com/connectrpc/examples-es&quot;&gt;examples for several different frontend frameworks&lt;/a&gt;, but we’ll use React for this example, which is also what we use at Oblique. Let’s start with a basic frontend skeleton:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;app/
	node_modules/
		...
	dist/
		index.html
	index.tsx
	package.json
	package-lock.json&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Use a package manager of your choice to initialize &lt;code&gt;package.json&lt;/code&gt; and &lt;code&gt;package-lock.json&lt;/code&gt;. Here is &lt;code&gt;dist/index.html&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;UTF-8&quot; /&gt;
    &lt;meta name=&quot;viewport&quot; content=&quot;width=device-width, initial-scale=1.0&quot; /&gt;
    &lt;meta http-equiv=&quot;X-UA-Compatible&quot; content=&quot;ie=edge&quot; /&gt;
    &lt;title&gt;My app&lt;/title&gt;
    &lt;script src=&quot;index.js&quot; defer&gt;&lt;/script&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;div id=&quot;root&quot;&gt;&lt;/div&gt;
  &lt;/body&gt;
&lt;/html&gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And here is &lt;code&gt;index.tsx&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;import React from &quot;react&quot;;
import ReactDOM from &quot;react-dom/client&quot;;

const App: React.FC = () =&gt; {
  return &lt;div&gt;Hello world!&lt;/div&gt;;
};

ReactDOM.createRoot(document.getElementById(&quot;root&quot;) as HTMLElement).render(
  &lt;React.StrictMode&gt;
    &lt;App /&gt;
  &lt;/React.StrictMode&gt;,
);&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;To turn our TypeScript file into an &lt;code&gt;index.js&lt;/code&gt; that the browser can interpret and execute, we’ll use the &lt;code&gt;esbuild&lt;/code&gt; bundler. You can use any bundler of your choice for this step, &lt;code&gt;esbuild&lt;/code&gt; is not mandatory.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ go install github.com/evanw/esbuild/cmd/esbuild@latest
$ cd app &amp;&amp; esbuild --bundle --outdir=dist index.tsx&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This will produce our &lt;code&gt;dist/index.js&lt;/code&gt;. With this &lt;code&gt;dist&lt;/code&gt; folder set up, we can serve the app using any HTTP file server, but we’ll use our Go server to serve the app because it’s easy, and avoids the need for &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS&quot;&gt;CORS&lt;/a&gt;. One way to do that is to drop an &lt;code&gt;app.go&lt;/code&gt; file into the &lt;code&gt;app&lt;/code&gt; folder and then embed the &lt;code&gt;dist&lt;/code&gt; folder:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;package app

import &quot;embed&quot;

//go:embed dist
var Dist embed.FS&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This can be imported as a Go package with the &lt;code&gt;Dist&lt;/code&gt; variable being an in-memory embedding of the contents of the &lt;code&gt;dist&lt;/code&gt; folder. Let’s modify &lt;code&gt;main.go&lt;/code&gt; to import this package and serve it when not serving API requests:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;package main

import (
	&quot;context&quot;
	&quot;io/fs&quot;
	&quot;log&quot;
	&quot;net/http&quot;

	&quot;connectrpc.com/connect&quot;
	&quot;github.com/mycorp/myproject/app&quot;
	mybackendv1 &quot;github.com/mycorp/myproject/gen/mycorp/mybackend/v1&quot;
	&quot;github.com/mycorp/myproject/gen/mycorp/mybackend/v1/mybackendv1connect&quot;
	&quot;google.golang.org/protobuf/proto&quot;
)

// backend service definitions, omitted for brevity

func main() {
	mux := http.NewServeMux()
	mux.Handle(mybackendv1connect.NewBackendServiceHandler(&amp;backendService{}))
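	// Strip the leading &quot;dist&quot; directory so that dist/index.html is served at &quot;/&quot;.
	distFS, err := fs.Sub(app.Dist, &quot;dist&quot;)
	if err != nil {
		log.Fatalf(&quot;failed to sub dist from embedded filesystem: %v&quot;, err)
	}
	// Any path not claimed by the Connect handler falls through to the static files.
	// Note: http.FileServerFS requires Go 1.22 or newer.
	mux.Handle(&quot;/&quot;, http.FileServerFS(distFS))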
	log.Println(&quot;Serving on http://localhost:8080&quot;)
	if err := http.ListenAndServe(&quot;localhost:8080&quot;, mux); err != nil &amp;&amp; err != http.ErrServerClosed {
		log.Fatalf(&quot;failed to start server: %v&quot;, err)
	}
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now if we serve the backend and visit &lt;a href=&quot;http://localhost:8080/&quot;&gt;http://localhost:8080&lt;/a&gt; we are greeted with our rendered React app.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/5b96ae0b8bd9b8f676089114cec7ed9e607f3a4c-982x730.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Not the prettiest web app ever made, but it gets the job done. Now that we’ve hosted the frontend, let’s try generating the client to let it talk to the backend over Connect. The Protobuf plugin we are looking for is called &lt;code&gt;protoc-gen-es&lt;/code&gt;, and there are a few different ways to use it. The easiest way for open-source or non-sensitive code is to use the &lt;a href=&quot;https://buf.build/bufbuild/es?version=v2.10.2&quot;&gt;Buf Schema Registry’s remote plugin&lt;/a&gt;. Note that this will send your Protobuf files to Buf’s servers to allow them to generate your code for you, so don’t do this for sensitive or private code unless you know that this is OK. For this example, we will use a local plugin installed into our &lt;code&gt;node_modules&lt;/code&gt; folder, but we should understand that managing local Protobuf plugins can introduce hard-to-debug issues for local users, such as when the wrong version of a plugin is being used.&lt;/p&gt;&lt;p&gt;Install the npm package &lt;code&gt;@bufbuild/protoc-gen-es&lt;/code&gt; using your favorite package manager. This will create a file that we can execute using a shell, &lt;code&gt;node_modules/.bin/protoc-gen-es&lt;/code&gt;. We can use this file as a Protobuf plugin through our &lt;code&gt;buf.gen.yaml&lt;/code&gt; file:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;version: v2
plugins:
  - local: protoc-gen-go
    out: gen
    opt:
      - paths=source_relative
  - local: protoc-gen-connect-go
    out: gen
    opt:
      - paths=source_relative
  - local: ./app/node_modules/.bin/protoc-gen-es
    out: app/gen/
    opt:
      - target=ts&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Use the option &lt;code&gt;target=ts&lt;/code&gt; to generate a TypeScript file. Regenerate the files:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ buf generate&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This will create a new file, &lt;code&gt;app/gen/mycorp/mybackend/v1/backend_pb.ts&lt;/code&gt;. Again, note how the file path is based on the Protobuf file structure. The generated file exposes some types and values but is mostly useless on its own. To make use of it, we need to add some more npm packages:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;code&gt;@bufbuild/protobuf&lt;/code&gt;&lt;/li&gt;&lt;li&gt;&lt;code&gt;@connectrpc/connect&lt;/code&gt;&lt;/li&gt;&lt;li&gt;&lt;code&gt;@connectrpc/connect-web&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Now we can update our minimal frontend app to make use of the backend over Connect:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;import { createClient } from &quot;@connectrpc/connect&quot;;
import { createConnectTransport } from &quot;@connectrpc/connect-web&quot;;
import React, { useState } from &quot;react&quot;;
import ReactDOM from &quot;react-dom/client&quot;;
import {
  BackendService,
  GetCurrentUserResponse,
} from &quot;./gen/mycorp/mybackend/v1/backend_pb&quot;;

const client = createClient(
  BackendService,
  createConnectTransport({
    baseUrl: &quot;.&quot;,
  }),
);

const App: React.FC = () =&gt; {
  const [user, setUser] = useState&lt;GetCurrentUserResponse | null&gt;(null);

  const onClick = async () =&gt; {
    const response = await client.getCurrentUser({});
    setUser(response);
  };

  return (
    &lt;div&gt;
      &lt;h1&gt;User Dashboard&lt;/h1&gt;
      &lt;button onClick={onClick}&gt;Get Current User&lt;/button&gt;
      {user &amp;&amp; (
        &lt;div
          style={{
            marginTop: &quot;20px&quot;,
            border: &quot;1px solid #ccc&quot;,
            padding: &quot;10px&quot;,
          }}
        &gt;
          &lt;h3&gt;User Details&lt;/h3&gt;
          &lt;p&gt;
            &lt;strong&gt;ID:&lt;/strong&gt; {user.userId}
          &lt;/p&gt;
          &lt;p&gt;
            &lt;strong&gt;Name:&lt;/strong&gt; {user.fullName}
          &lt;/p&gt;
          &lt;p&gt;
            &lt;strong&gt;Email:&lt;/strong&gt; {user.email}
          &lt;/p&gt;
          &lt;p&gt;
            &lt;strong&gt;Avatar URL:&lt;/strong&gt; {user.avatarUrl}
          &lt;/p&gt;
        &lt;/div&gt;
      )}
    &lt;/div&gt;
  );
};

ReactDOM.createRoot(document.getElementById(&quot;root&quot;) as HTMLElement).render(
  &lt;React.StrictMode&gt;
    &lt;App /&gt;
  &lt;/React.StrictMode&gt;,
);&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There’s a lot going on here, so let’s break it down. First, we create a client:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const client = createClient(
  BackendService,
  createConnectTransport({
    baseUrl: &quot;.&quot;,
  }),
);&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This client uses the generated &lt;code&gt;BackendService&lt;/code&gt; value and some TypeScript magic to forward the type information. The baseUrl is &lt;code&gt;.&lt;/code&gt; since we’re using the Go backend both for hosting the app and the API.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const [user, setUser] = useState&lt;GetCurrentUserResponse | null&gt;(null);

const onClick = async () =&gt; {
  const response = await client.getCurrentUser({});
  setUser(response);
};&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This is where we make the call to the backend. We pass &lt;code&gt;{}&lt;/code&gt; to &lt;code&gt;getCurrentUser&lt;/code&gt; because the request message currently has no fields. The call is asynchronous, so we &lt;code&gt;await&lt;/code&gt; the response and then store it in component state with &lt;code&gt;setUser&lt;/code&gt;.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;return (
  &lt;div&gt;
    &lt;h1&gt;User Dashboard&lt;/h1&gt;
    &lt;button onClick={onClick}&gt;Get Current User&lt;/button&gt;
  &lt;/div&gt;
);&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We hook up the &lt;code&gt;onClick&lt;/code&gt; handler to a &lt;code&gt;button&lt;/code&gt; to trigger the call to the backend.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{user &amp;&amp; (
  &lt;div
    style={{
      marginTop: &quot;20px&quot;,
      border: &quot;1px solid #ccc&quot;,
      padding: &quot;10px&quot;,
    }}
  &gt;
    &lt;h3&gt;User Details&lt;/h3&gt;
    &lt;p&gt;
      &lt;strong&gt;ID:&lt;/strong&gt; {user.userId}
    &lt;/p&gt;
    &lt;p&gt;
      &lt;strong&gt;Name:&lt;/strong&gt; {user.fullName}
    &lt;/p&gt;
    &lt;p&gt;
      &lt;strong&gt;Email:&lt;/strong&gt; {user.email}
    &lt;/p&gt;
    &lt;p&gt;
      &lt;strong&gt;Avatar URL:&lt;/strong&gt; {user.avatarUrl}
    &lt;/p&gt;
  &lt;/div&gt;
)}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We conditionally render the user information when the user is available (once the backend function call has returned). Re-bundle:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ cd app &amp;&amp; esbuild --bundle --outdir=dist index.tsx&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With the new bundle served, clicking the button fetches the current user from the backend and renders the details below it.&lt;/p&gt;&lt;h3&gt;Making changes&lt;/h3&gt;&lt;p&gt;Whew, that was a lot! As we mentioned at the start, there is quite a bit of boilerplate to get everything set up, but now that all that work is done, making changes, or even adding new APIs, becomes &lt;em&gt;so much easier&lt;/em&gt; that it’ll all have been worth it. Let’s try adding a new input parameter to the &lt;code&gt;GetCurrentUser&lt;/code&gt; RPC:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Rest of Protobuf file omitted for brevity
message GetCurrentUserRequest {
	uint32 avatar_option = 1;
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We then regenerate the files:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ buf generate&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If we look at our frontend, we can see that we now have the option to specify an &lt;code&gt;avatarOption&lt;/code&gt; field in &lt;code&gt;getCurrentUser&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const onClick = async () =&gt; {
  const response = await client.getCurrentUser({
    avatarOption: 1,
  });
  setUser(response);
};&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The TypeScript compiler helpfully tells us that the type of this field is &lt;code&gt;number&lt;/code&gt;, so we can’t accidentally send a &lt;code&gt;string&lt;/code&gt; or something else nonsensical. Don’t forget to re-bundle the frontend:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;$ cd app &amp;&amp; esbuild --bundle --outdir=dist index.tsx&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;On the backend side, we implement support for a few different avatar options:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;func (s *backendService) GetCurrentUser(ctx context.Context, req *connect.Request[mybackendv1.GetCurrentUserRequest]) (*connect.Response[mybackendv1.GetCurrentUserResponse], error) {
	resp := &amp;mybackendv1.GetCurrentUserResponse{
		UserId:    proto.String(&quot;12345&quot;),
		FullName:  proto.String(&quot;John Doe&quot;),
		Email:     proto.String(&quot;john.doe@example.com&quot;),
		AvatarUrl: proto.String(&quot;https://example.com/avatar.jpg&quot;),
	}
	switch req.Msg.GetAvatarOption() {
	case 0:
		// Default avatar
	case 1:
		resp.AvatarUrl = proto.String(&quot;https://example.com/avatar1.jpg&quot;)
	case 2:
		resp.AvatarUrl = proto.String(&quot;https://example.com/avatar2.jpg&quot;)
	default:
		return nil, connect.NewError(connect.CodeInvalidArgument, nil)
	}
	return connect.NewResponse(resp), nil
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now we can run the website again and see that the backend will return the avatar corresponding to option 1!&lt;/p&gt;&lt;p&gt;Adding new RPCs and even new services is similarly simple, and with a single source-of-truth for your API definitions, there is never a question of whether the API is exposed by the backend yet, or what the request format looks like. Connect has changed the game by thinking through and &lt;em&gt;following through&lt;/em&gt; on a frontend developer experience that finally gets frontend developers on board and excited about typed API development.&lt;/p&gt;&lt;h3&gt;Further reading&lt;/h3&gt;&lt;p&gt;If you want to get started with type-safe frontend APIs in your own company or project, the &lt;a href=&quot;https://connectrpc.com/docs/introduction&quot;&gt;Connect documentation&lt;/a&gt; has a lot of information to get you started in the language of your choice.&lt;/p&gt;&lt;p&gt;For generic Protobuf API design, &lt;a href=&quot;https://google.aip.dev/&quot;&gt;the Google AIPs&lt;/a&gt; are a great source of advice and rules that you should read. There is &lt;a href=&quot;https://github.com/googleapis/api-linter&quot;&gt;a linter&lt;/a&gt; you can use to ensure you are conformant.&lt;/p&gt;&lt;p&gt;Likewise, &lt;a href=&quot;https://buf.build/docs/best-practices/style-guide/&quot;&gt;the Buf style guide&lt;/a&gt; and the &lt;code&gt;buf lint&lt;/code&gt; command help you get started and stay conformant with Buf’s Protobuf best-practice recommendations.&lt;/p&gt;</content:encoded><author>Johan Brandhorst-Satzkorn</author></item><item><title>The real SSO tax</title><link>https://oblique.security/blog/real-sso-tax/</link><guid isPermaLink="true">https://oblique.security/blog/real-sso-tax/</guid><description>The SSO tax shouldn&apos;t be about having SSO — it should be about enforcing it. 
The value of SSO is to centrally manage access and require strong authentication.</description><pubDate>Mon, 08 Dec 2025 18:58:40 GMT</pubDate><content:encoded>&lt;p&gt;We’re preparing for our upcoming SOC 2 audit here at Oblique. And it’s maddening to realize that it’s 2025, and you &lt;em&gt;still&lt;/em&gt; can’t reasonably ensure that your employees are using strong authentication.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;What your security team really wants is to understand what applications your employees have access to, and enforce that they use strong authentication to access those applications. That’s why SSO exists in the first place! Since we use Google as our IdP and Apple Business Manager as our MDM, our ideal setup would be: use (and enforce) Google SSO everywhere we can — which can prompt for multi-factor authentication — and use a passkey bound to the company-managed device where we can’t.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;That sounds reasonable, but it&amp;#x27;s not what happens in reality.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Where apps &lt;em&gt;do&lt;/em&gt; support SSO, you often can&amp;#x27;t enforce that it&amp;#x27;s being used. And where they &lt;em&gt;don&amp;#x27;t&lt;/em&gt; support SSO, you often can&amp;#x27;t ensure strong auth. This gap is the real frustration. The &lt;a href=&quot;https://sso.tax/&quot;&gt;SSO tax&lt;/a&gt; shouldn&amp;#x27;t be about having SSO — it should be about enforcing it. Why did I just spend 15 minutes configuring SAML SSO for our HRIS, if a user can still log in with a username and password? Why did I set up an authenticator app for this login if the app is just going to email me a magic link anyway? We recently had an automated compliance check fail because a user only used a passkey — which doesn’t count as MFA. What is this security theater where we constantly weaken security?&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I don&amp;#x27;t want to pay to have SSO. 
But I’m willing to pay to enforce the use of SSO.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I understand where this comes from. Password resets and account recovery are huge pains for consumer apps, wasting support teams&amp;#x27; time. For both consumer and enterprise apps, you usually want the user to be able to log back in (otherwise, how are you going to hit your growth targets?). In an organization, account recovery also wastes your IT team&amp;#x27;s time. But the usability of SSO means this shouldn&amp;#x27;t be a problem, unless a user locks themselves out of their main account. In which case, yes, your IT team really should talk to them before resetting anything.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;If you&amp;#x27;re building an enterprise application, please support SSO — and only SSO — so that it&amp;#x27;s easy for your customers&amp;#x27; IT teams to enforce it (because it&amp;#x27;s the only option) and rely on MFA controls in their identity provider. You can skip building:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;MFA that only supports an authenticator app, not a hardware token or TouchID&lt;/li&gt;&lt;li&gt;MFA that only supports SMS or email-based codes&lt;/li&gt;&lt;li&gt;Strong MFA support that allows downgrading to weaker options&lt;/li&gt;&lt;li&gt;Strong MFA support that isn&amp;#x27;t required&lt;/li&gt;&lt;li&gt;Username/password login that requires MFA, but doesn&amp;#x27;t support passkeys&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Not only is this complicated to build and maintain, it&amp;#x27;s actively unhelpful for your customers&amp;#x27; security posture.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;There&amp;#x27;s one notable exception: enterprise apps that require employees to use a personal email for continuity, so they can still access the account after leaving the company — like payroll providers or equity management tools. In those cases, please do provide (and require) strong MFA — and only strong MFA. 
This is exactly the kind of data that most needs to be protected.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;These identity frustrations aren&amp;#x27;t exciting. But they&amp;#x27;re real, and they&amp;#x27;re exactly the kind of problems we&amp;#x27;re working on at Oblique.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>What we can learn from real-world authentication failures</title><link>https://oblique.security/blog/authn-failures/</link><guid isPermaLink="true">https://oblique.security/blog/authn-failures/</guid><description>Recent breaches at Okta, Snowflake, and Twitter help us learn how to prevent authentication failures like credential theft, MFA bypass, and session hijacking.</description><pubDate>Wed, 15 Oct 2025 17:33:15 GMT</pubDate><content:encoded>&lt;p&gt;&lt;em&gt;This blog post is a written version of a talk that our cofounder Maya gave at BSidesSeattle in April. You can also &lt;a href=&quot;https://www.youtube.com/watch?v=l69xs3ehTQI&quot;&gt;watch the recording&lt;/a&gt; and &lt;a href=&quot;https://github.com/mayakacz/presentation-slides/blob/master/20250418%20-%20BSidesSeattle%20-%20When%20authn%20breaks.pdf&quot;&gt;get the slides&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;Authentication is one of the most critical identity controls we have. In some sense, it&amp;#x27;s both the first and last line of defense. If authentication fails, it means that attackers can bypass all other controls. And since identity is tied to all other systems, the compromise of even a single account could have a wide-reaching impact.&lt;/p&gt;&lt;p&gt;Although there are so many ways that authentication &lt;em&gt;could&lt;/em&gt; go wrong — credential compromise, MFA and recovery weaknesses, session management failures, and poorly implemented protocols — there’s not necessarily that much variance in reality. In this post, we&amp;#x27;ll look at real-world authentication failures from the last five years. 
Studying these incidents is how our industry learns and improves, as long as we actually apply those lessons.&lt;/p&gt;&lt;h2&gt;Credential compromise&lt;/h2&gt;&lt;p&gt;Credential compromise is the most common type of authentication failure, whether that be password theft, credential stuffing, or a brute force attack. This could also occur via the compromise of an identity provider or password manager.&lt;/p&gt;&lt;p&gt;We know how to prevent most credential compromises: by using multi-factor authentication. Yet, these incidents remain shockingly common and impactful in reality. The real failure isn&amp;#x27;t &amp;quot;credential compromise,&amp;quot; it&amp;#x27;s &amp;quot;no MFA” (let&amp;#x27;s call it what it is).&lt;/p&gt;&lt;h3&gt;LAPSUS$ gains access to Okta’s support system&lt;/h3&gt;&lt;p&gt;In January 2022, LAPSUS$ compromised an Okta third-party support engineer&amp;#x27;s laptop, using it to then access the customer support system. The breach only came to light months later when &lt;a href=&quot;https://x.com/_MG_/status/1506109152665382920&quot;&gt;screenshots of Okta&amp;#x27;s own Okta instance with superuser access were posted on Twitter&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In all, the attacker had access to the individual’s laptop for 5 days, giving them access to 366 customers&amp;#x27; Okta tenants, though seemingly no further damage was done.&lt;/p&gt;&lt;p&gt;Many of the authentication vulnerabilities we&amp;#x27;ve seen in the past few years can be attributed to LAPSUS$ — and they definitely have &lt;a href=&quot;https://www.youtube.com/watch?v=9wpBaXcXQSM&quot;&gt;their preferred playbook&lt;/a&gt;.&lt;/p&gt;&lt;h3&gt;Another compromise of Okta’s support system&lt;/h3&gt;&lt;p&gt;In 2023, an attacker again gained unauthorized access to Okta&amp;#x27;s customer support system, &lt;a href=&quot;https://sec.okta.com/articles/2023/11/unauthorized-access-oktas-support-case-management-system-root-cause/&quot;&gt;affecting 134 of their 
customers&lt;/a&gt;. The breach occurred when an employee saved their work credentials to their personal Chrome password manager, which was later compromised.&lt;/p&gt;&lt;p&gt;The attacker found that some support chats, logs, and files had valid session tokens, which they were able to use to compromise five of Okta’s customers, including Cloudflare, BeyondTrust, and 1Password.&lt;/p&gt;&lt;h3&gt;Credential stuffing Snowflake customers&lt;/h3&gt;&lt;p&gt;In early 2024, members of online crime-focused chat group The Com targeted Snowflake. Many of Snowflake’s accounts were only protected with a username and password (no second factor). &lt;a href=&quot;https://cloud.google.com/blog/topics/threat-intelligence/unc5537-snowflake-data-theft-extortion&quot;&gt;UNC5537&lt;/a&gt; purchased customers&amp;#x27; credentials on the dark web and systematically took over these accounts.&lt;/p&gt;&lt;p&gt;Up to 165 Snowflake customers were compromised, with attackers stealing phone and text message records for 110 million AT&amp;amp;T customers and leaking 160 thousand Taylor Swift Eras tour barcodes from Ticketmaster. The attackers also extorted dozens of companies in exchange for not releasing their data, &lt;a href=&quot;https://www.404media.co/the-walls-are-closing-in-on-the-snowflake-hacker/&quot;&gt;netting about $2 million&lt;/a&gt; in ransom payments.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://www.snowflake.com/en/blog/multi-factor-identification-default/&quot;&gt;In response, Snowflake made MFA generally available&lt;/a&gt;, requiring it for new accounts — and increased the minimum password length from eight to fourteen characters.&lt;/p&gt;&lt;p&gt;These credential compromises demonstrate that basic authentication still fails spectacularly. 
But what happens when organizations do implement MFA — and attackers find a way around it?&lt;/p&gt;&lt;h2&gt;MFA and recovery weaknesses&lt;/h2&gt;&lt;p&gt;When basic credential theft fails, attackers pivot to target the second factor directly. MFA and recovery weaknesses include MFA fatigue, implementation issues in recovery flows, and SIM swapping.&lt;/p&gt;&lt;h3&gt;Social engineering to get access to Twitter&amp;#x27;s support tool&lt;/h3&gt;&lt;p&gt;Twitter’s support team used an internal admin tool for common support workflows, such as suspending accounts or changing their recovery emails. Knowing that as many as 1500 Twitter employees had access to such a powerful tool, &lt;a href=&quot;https://en.wikipedia.org/wiki/2020_Twitter_account_hijacking&quot;&gt;attackers scraped LinkedIn for Twitter employee profiles, then called them impersonating the IT department&lt;/a&gt;. The attackers directed their targets to a fake VPN login page and phished their 2FA code, gaining access to Twitter’s internal network, including the admin tool.&lt;/p&gt;&lt;p&gt;In this way, the attackers were able to gain access to about 130 prominent accounts, tweeting scam bitcoin messages from 45 of those accounts. This netted them around $120k in bitcoin payments before the messages were taken down.&lt;/p&gt;&lt;h3&gt;MFA fatigue at Uber&lt;/h3&gt;&lt;p&gt;To get into Uber, LAPSUS$ &lt;a href=&quot;https://www.uber.com/newsroom/security-update/&quot;&gt;obtained VPN credentials for a contractor&lt;/a&gt;, then continuously bombarded that user with MFA prompts while simultaneously contacting them on WhatsApp posing as ‘tech support’. Eventually, the user approved a prompt, allowing the attacker to log in. Once on Uber&amp;#x27;s VPN, the attacker had access to the contractor’s accounts on Google Workspace and Slack.&lt;/p&gt;&lt;p&gt;This was the same technique LAPSUS$ used against Microsoft, Nvidia, and Okta. 
MFA fatigue works because it only takes one approval — whether from exhaustion, confusion, or a simple misclick — to let attackers in.&lt;/p&gt;&lt;h2&gt;Session management failures&lt;/h2&gt;&lt;p&gt;Even with second factors in place, attackers adapt. Some have moved to bypassing authentication entirely, by stealing active sessions — through session hijacking, cookie theft, OAuth token theft, XSS, or CSRF. Once a legitimate user has authenticated and proven their identity, the attacker simply takes over that established session.&lt;/p&gt;&lt;h3&gt;GitHub OAuth token theft&lt;/h3&gt;&lt;p&gt;GitHub issues OAuth tokens for connecting with other systems like CI/CD pipelines. In April 2022, &lt;a href=&quot;https://github.blog/news-insights/company-news/security-alert-stolen-oauth-user-tokens/&quot;&gt;GitHub discovered attackers using stolen Heroku and Travis-CI OAuth tokens to access organizations’ data&lt;/a&gt;. Each token allowed the attackers to access the GitHub API as that user, enumerate their organizations and private repositories, then selectively clone repositories they were interested in.&lt;/p&gt;&lt;p&gt;The attackers then searched private repos for secrets. They found npm&amp;#x27;s AWS infrastructure API key, and proceeded to download all private npm package manifests and metadata, 100k npm users&amp;#x27; hashed passwords, and private packages from two organizations.&lt;/p&gt;&lt;p&gt;OAuth tokens — which often provide broad, persistent access — have become increasingly attractive targets because organizations rarely review which apps employees have authorized or what scopes those apps have been granted.&lt;/p&gt;&lt;h3&gt;Stealing CircleCI sessions and customer tokens&lt;/h3&gt;&lt;p&gt;In January 2023, CircleCI was alerted to an issue when one of their customers noticed suspicious activity for a GitHub OAuth token that was stored in CircleCI. 
&lt;a href=&quot;https://circleci.com/blog/jan-4-2023-incident-report/&quot;&gt;They discovered an attacker had compromised an engineer&amp;#x27;s laptop with malware, stealing an active 2FA-backed SSO session&lt;/a&gt;. The engineer had production access, which the attacker used to steal customer environment variables, tokens, and keys.&lt;/p&gt;&lt;p&gt;CircleCI issued a notice for its customers to rotate secrets they had stored in CircleCI, and then went a step further, coordinating with other infrastructure providers to invalidate and rotate OAuth tokens for GitHub, Bitbucket, and GitLab.&lt;/p&gt;&lt;h2&gt;Authentication protocol failures&lt;/h2&gt;&lt;p&gt;The last kind of authentication failure we’ll consider is a weakness in an authentication protocol. These are rarely weaknesses in the protocols themselves, and more often stem from poor implementations, such as OIDC misconfigurations or missing JWT validation.&lt;/p&gt;&lt;h3&gt;ProxyShell vulnerabilities in Microsoft Exchange&lt;/h3&gt;&lt;p&gt;&lt;a href=&quot;https://www.zerodayinitiative.com/blog/2021/8/17/from-pwn2own-2021-a-new-attack-surface-on-microsoft-exchange-proxyshell&quot;&gt;At Pwn2Own Vancouver 2021, researchers discovered vulnerabilities in Microsoft Exchange&lt;/a&gt;. By exploiting an &amp;quot;Explicit Login&amp;quot; feature used for direct access to a user&amp;#x27;s inbox, attackers could manipulate Exchange&amp;#x27;s URL normalization and gain unauthenticated access to Exchange Server internals. Chained with two other CVEs — &lt;a href=&quot;https://cloud.google.com/blog/topics/threat-intelligence/pst-want-shell-proxyshell-exploiting-microsoft-exchange-servers&quot;&gt;collectively called ProxyShell&lt;/a&gt; — this enabled remote code execution on multiple Exchange Server versions, affecting tens of thousands of servers globally.&lt;/p&gt;&lt;p&gt;Microsoft quickly released patches, but about a year later, suspiciously similar attacks were detected. 
It turns out that the patches were insufficient: Exchange remained vulnerable to server-side request forgery, with two more CVEs — &lt;a href=&quot;https://www.securonix.com/blog/proxynotshell-revisited/&quot;&gt;dubbed ProxyNotShell&lt;/a&gt; — allowing authenticated attackers to bypass two-factor authentication to again achieve RCE.&lt;/p&gt;&lt;h2&gt;Lessons learned&lt;/h2&gt;&lt;p&gt;So, what can we learn and apply from these real-world failures?&lt;/p&gt;&lt;h3&gt;Implementing authentication&lt;/h3&gt;&lt;p&gt;If you’re implementing authentication:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Check for common implementation mistakes&lt;/strong&gt;. These can happen to anyone. Not even sophisticated organizations — like Microsoft, with critical authentication products like Active Directory, or Okta, an identity company — are immune.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Use the right algorithms&lt;/strong&gt;. Double check anything that is particularly sensitive, or touches cryptography, like hashing or input normalization.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Implement session timeouts&lt;/strong&gt;. Limit the impact of session hijacking by restricting how long a token or credential is valid for.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Prevent brute force attacks&lt;/strong&gt;. Implement rate limits and log authentication attempts to avoid password spray or mass credential stuffing attacks.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Provide users with more than just basic auth&lt;/strong&gt;. Implement SSO, MFA, and strong controls for admin or other sensitive actions.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Using authentication&lt;/h3&gt;&lt;p&gt;So many of the issues we’ve seen in the last few years have very much followed a playbook — often LAPSUS$’s playbook. 
&lt;strong&gt;If there’s only one thing beyond the basics that you should really be doing, it’s rolling out strong phishing-resistant MFA, like hardware tokens or passkeys.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The basics still matter most:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Implement MFA&lt;/strong&gt;. You should really do this!&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Require the use of strong MFA&lt;/strong&gt;. Use hardware or WebAuthn-based MFA.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Educate your users&lt;/strong&gt;. Warn employees about attacks like MFA fatigue, and make sure they know how to report suspicious activity.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Set up monitoring&lt;/strong&gt;. Alert on suspicious authentication events like MFA failures, password resets, and MFA enrollment for new devices.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Beyond MFA, you should also:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Review OAuth tokens in your environment&lt;/strong&gt;. Check which integrations are authorized, what scopes they have, and when tokens were last rotated. Treat this as an ongoing process, not a one-time audit.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Prevent employees from using personal password managers&lt;/strong&gt;. Provide them with a corporate password manager.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Pay attention to contractor access&lt;/strong&gt;. Secure how non-employees, including those in support, access critical systems. This includes endpoint protection or browser controls to detect keylogging or infostealing malware.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Finally, assume compromise will happen and prepare accordingly. Design for defense in depth, implement logging and monitoring, and when a breach does occur, move quickly. Lastly, please share what you can — that&amp;#x27;s how the industry learns and improves.&lt;/p&gt;&lt;p&gt;Authentication failures continue to be a primary vector for major security incidents. 
Our best defense is learning from the past and preparing for what seems inevitable.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Internal security tools rarely get the UX they deserve</title><link>https://oblique.security/blog/security-ux/</link><guid isPermaLink="true">https://oblique.security/blog/security-ux/</guid><description>Security teams underestimate the investment needed for internal tools, and so underinvest in UX. When security tools are painful to use, people bypass security.</description><pubDate>Tue, 07 Oct 2025 16:18:43 GMT</pubDate><content:encoded>&lt;p&gt;Building internal tools is hard. Not necessarily &lt;em&gt;technically&lt;/em&gt; hard, but it’s hard to build tools that people actually want to use. Security teams are good at creating access request portals, compliance dashboards, and exception workflows that technically work, but that are painful for our users — and then we wonder why these controls and processes get circumvented. When we build internal security tools, we consistently underestimate the time, investment, and ongoing maintenance needed to build something that people actually want to use.&lt;/p&gt;&lt;p&gt;We’re thinking about this constantly as we build access management tools at Oblique. We&amp;#x27;ve experienced firsthand how challenging it is to create security tools that people actually embrace rather than reluctantly tolerate. An employee isn&amp;#x27;t necessarily thinking of the security implications when they share superadmin creds with a teammate who urgently needs to debug an issue, or when they download a sensitive financial report to a USB key to get printed at FedEx — they&amp;#x27;re just trying to get their work done when the approved path is too slow or unclear. 
And yes, I’ve done both of these things.&lt;/p&gt;&lt;p&gt;Whether you’re building because there isn’t a good enough solution on the market or because your environment is unique (or because you &lt;em&gt;think&lt;/em&gt; it’s unique), the challenge remains the same. Before you commit to building — or deploying — a new security tool, step back and consider the full cost: you&amp;#x27;re committing to UX work, user support, integrations with every tool in your environment, and maintenance that will outlast the engineer who built it.&lt;/p&gt;&lt;h2&gt;Security engineers aren’t the best product engineers&lt;/h2&gt;&lt;p&gt;I’ll say the quiet part out loud: security engineers aren’t the best product engineers. (But that’s okay, because that’s not why you hired them!) Security teams now hire engineers who can build secure systems, not just analysts who review them. But engineering secure systems and designing intuitive interfaces are different skills — and most security engineers were hired for the former, not the latter.&lt;/p&gt;&lt;p&gt;Most security teams have limited engineering resources that need to support many teams: analysts, compliance, and IT. These engineers often focus on building core infrastructure and backend systems — with less experience in frontend and user experience. (This isn’t to say full stack snowflakes don’t exist, they’re just rare.)&lt;/p&gt;&lt;p&gt;When security teams &lt;em&gt;do&lt;/em&gt; build tools, just like with any other development, the timeline estimation is completely off. We estimate the time required based on the core functionality we need, but forget about everything else: monitoring and logging, integrations with the IdP and SIEM, and testing. We struggle just to ensure coverage across internal systems — there&amp;#x27;s no time left for polish. 
UX improvements, &lt;a href=&quot;https://breanneboland.com/blog/2022/06/04/bsides-sf-2022-read-the-fantastic-manual/&quot;&gt;documentation&lt;/a&gt;, and internal training all get cut.&lt;/p&gt;&lt;p&gt;Internal teams also don&amp;#x27;t generally get the full support you need to build a product: they don’t get product managers, designers, or docs writers. And if you do get allocated this help, you’re lucky to get a passionate volunteer. PMs working on internal tools tend to be evaluated against the same criteria as those shipping customer-facing products, putting them at a disadvantage for promotions and limiting career growth.&lt;/p&gt;&lt;h2&gt;Usability matters for security&lt;/h2&gt;&lt;p&gt;OK, everyone builds terrible internal tools — so why pick on security? Because &lt;a href=&quot;https://www.paloaltonetworks.com/blog/2025/07/security-by-design-ux-ai-modern-cybersecurity/&quot;&gt;usability actually matters in how effective your security program is&lt;/a&gt;. When our tools are too painful to use, people develop workarounds. An access request workflow that &lt;a href=&quot;/blog/justification-fields/&quot;&gt;requires justifications that are never read&lt;/a&gt; and time-consuming approvals from multiple people teaches engineers to just share credentials instead.&lt;/p&gt;&lt;p&gt;A solution that&amp;#x27;s ‘good enough’ with unnecessary or unclear steps isn&amp;#x27;t &lt;em&gt;actually&lt;/em&gt; good enough. It becomes an ongoing support burden: constant tickets for help because users can&amp;#x27;t figure out basic workflows, and requests for new functionality or integrations with new systems. 
This gets worse when the person who originally built the tool has moved on, and no one else understands how the black box works or how to modify it.&lt;/p&gt;&lt;p&gt;Security teams also deal with high-risk, high-stress workflows — do you really want someone to cause an outage, miss a critical piece of information, or &lt;a href=&quot;https://en.wikipedia.org/wiki/2018_Hawaii_false_missile_alert&quot;&gt;accidentally send an alert because the UX wasn’t clear&lt;/a&gt;?&lt;/p&gt;&lt;h2&gt;Choosing where to invest&lt;/h2&gt;&lt;p&gt;When you do decide to build, invest properly from the start: think about UX, documentation, training or education, and exceptions. Explicitly hire for frontend and UX experience on the security team, and look for engineers with a product mindset or experience building for end users. Plan for integration work, ongoing maintenance, and support. Actually test your tool with your users and gather feedback. Track adoption, retention, and support tickets. If you&amp;#x27;re getting more support tickets than usage after launch, the tool isn&amp;#x27;t working. (Basically, act like it’s a real product — because to your users, it is!) If you don’t make that investment, it’ll all be a waste of engineering time.&lt;/p&gt;&lt;p&gt;And if you’re not willing or able to make that investment, realize you’re creating debt that you’ll still have to address later. Your engineering time is too valuable to waste on tools that people actively avoid using.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Modern access controls: takeaways on what actually works</title><link>https://oblique.security/blog/policies-report/</link><guid isPermaLink="true">https://oblique.security/blog/policies-report/</guid><description>We interviewed IT and security teams to ask them how they actually define, implement, and improve their access control policies. 
Get the report to learn more.</description><pubDate>Wed, 24 Sep 2025 20:40:25 GMT</pubDate><content:encoded>&lt;p&gt;As we build access control policies into Oblique, it’s been interesting talking to IT and security teams to get examples of what they’d like to be able to implement. They’re shockingly… boring and unoriginal? Access management is one of those areas where organizations converge on similar approaches, because everyone is working with the same constraints. Even diverse organizations are generally protecting similar types of assets, dealing with similar compliance requirements, and working with similar tech stacks.&lt;/p&gt;&lt;p&gt;So, we wanted to document what IT and security teams have actually implemented today, in a new report on &lt;a href=&quot;https://cdn.sanity.io/files/dlxnfmjc/production/6d9fabd34881379ae3bae0c1d13f563c90db6ad4.pdf&quot;&gt;Modern access control policies&lt;/a&gt;, based on interviews with leaders from organizations with 125 to 5000+ employees. Here are the takeaways:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Policy ownership is shifting from security-only to being jointly owned by security, IT, and the business. &lt;/strong&gt;While security teams historically defined policies and IT implemented them, organizations are moving toward shared ownership. More importantly, they&amp;#x27;re delegating authority to business teams, by involving them in defining requirements, setting policies, and handling approvals — rather than IT or security teams who don&amp;#x27;t have the context.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Controls should apply to data, not systems. &lt;/strong&gt;The strictest controls should protect the most important assets: customer environments, customer data, and data within the scope of a compliance framework. 
But instead of focusing on protecting systems, teams are starting to think more holistically about data classification — even if that’s just distinguishing between systems that do and don’t have customer data — and protect resources based on the kinds of data they have.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;More complex policy requirements are enforced when access is changed. &lt;/strong&gt;Access requirements can be enforced both when access is used, and when access is changed. Zero trust policies focus on real-time context around users and devices when allowing access to a resource. But the more nuanced requirements kick in when access is granted, modified, or renewed. These decisions require business context from HR systems and validation of requirements that simply aren&amp;#x27;t available at the time of access.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Moving approvals from IT to managers improves speed initially, but human approvals remain a bottleneck. &lt;/strong&gt;Delegating access decisions to managers notably improves the speed of an approval. But this initial unblock isn’t the end state. It’s often &lt;em&gt;still&lt;/em&gt; too slow, and weakens security, since managers will approve any request to unblock their team. The real security and speed improvements come from asking those with context for approvals — app owners — and automating approvals entirely where requests are consistently approved.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Pre-approving specific groups for access is necessary when implementing approval workflows&lt;/strong&gt;. To successfully reduce standing access, users need a way to regain that access validly without a cumbersome process. How can this ease of use be balanced with minimizing risk and cost? Pre-approve a set of users who can invoke access when they need it. 
This approach works well for common patterns, like engineers accessing production during on-call periods, or support staff accessing customer data with valid ticket numbers.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This report reflects what organizations are actually doing, not what they aspire to implement. Real-world organizations are, well, realistic. They need to balance security requirements while still keeping their organizations functional.&lt;/p&gt;&lt;p&gt;If you’re deciding what to implement in your organization, your peers have already figured out what works. &lt;a href=&quot;https://cdn.sanity.io/files/dlxnfmjc/production/6d9fabd34881379ae3bae0c1d13f563c90db6ad4.pdf&quot;&gt;Read the report&lt;/a&gt;, learn from their experiences, and apply the same patterns to your organization.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Delegate authority to those with context</title><link>https://oblique.security/blog/delegate-authority/</link><guid isPermaLink="true">https://oblique.security/blog/delegate-authority/</guid><description>Business teams have context for access decisions but lack authority. Delegate to those closest to the resources by defining clear ownership for each app.</description><pubDate>Wed, 17 Sep 2025 18:11:58 GMT</pubDate><content:encoded>&lt;p&gt;Early in my career, I thought the hard problems in security were the technical ones. Figure out the right architecture, pick a modern cipher suite, write it all in a memory-safe language — that&amp;#x27;s the important stuff, right? I grew up. People and process problems are always the hardest to solve, &lt;em&gt;especially&lt;/em&gt; at scale. And access management is a perfect example of this.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;IT and security teams don&amp;#x27;t want to be gatekeepers — they want to enable the business to move quickly. But if every access change, every application exception, every device approval needs to go through them... then they are gatekeepers. 
As it relates to access, the problem is that IT often lacks the context to make the right decisions quickly: who is Bob and whose app is it anyway? Meanwhile, business teams have the context but are stuck filing tickets. The solution isn&amp;#x27;t more people reviewing tickets — it&amp;#x27;s giving the right people the ability to make access decisions directly.&lt;/p&gt;&lt;h2&gt;We&amp;#x27;ve already started delegating security&lt;/h2&gt;&lt;p&gt;Engineering figured this out years ago: we have CODEOWNERS for repositories, on-call rotations for specific services, and now developer portals to help consolidate all of that information in one place. Security has learned this lesson too: when we find a security issue, we need to know who’s responsible for it, so that we can ask them to fix it. Security teams face the same ownership question everywhere: which team owns this vulnerable dependency? This unpatched server? This open firewall rule?&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We&amp;#x27;ve had the most success with code. Finding bugs centrally and assigning them to dev teams was a disaster, so we &amp;quot;shifted left&amp;quot; and made engineering teams own their vulnerabilities. It&amp;#x27;s not perfect, but it works better: developers know their code and can prioritize and fix the vulnerabilities they find. We’ve already been delegating the &lt;em&gt;work&lt;/em&gt; of fixing security issues, but we’re just starting to give these teams the control over what they prioritize.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Corporate security is also heading in this direction. A few years ago, &lt;a href=&quot;https://www.kolide.com/&quot;&gt;Kolide&lt;/a&gt; innovated in endpoint management by delegating responsibility to end users to install security updates or turn on disk encryption. 
Instead of IT managing MDM exceptions through tickets, users got direct notifications about security issues on their devices, and were often able to resolve them themselves, without ever opening a ticket. This kind of delegation is possible because it&amp;#x27;s clear who owns issues for a given laptop: the person signed into it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;More and more tools are embracing this model: &lt;a href=&quot;https://northpole.security/&quot;&gt;Workshop&lt;/a&gt; lets teams collectively attest that specific binaries should be allowlisted on their devices, rather than waiting for centralized approval. The team that actually needs Android development tools is better positioned to justify that need than someone in IT who&amp;#x27;s never built an Android app.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;The logical next step is applying this same principle to access management — and some organizations already are.&lt;/p&gt;&lt;h2&gt;Distributed access control isn&amp;#x27;t new&lt;/h2&gt;&lt;p&gt;The reality is that access management is already distributed, whether you know it or not. Most SaaS apps are already managed in a decentralized way. The marketing team set up the CMS, so someone on the marketing team (you&amp;#x27;re not sure who) controls access and invites new colleagues. Some SaaS tools like Slack and Figma let you join automatically if you&amp;#x27;re part of the same domain, to make it easier for teams to work together (and to grow that per-seat revenue). This happens in smaller organizations before you centralize control with an SSO provider, but even then, there&amp;#x27;s always shadow IT or apps that don&amp;#x27;t support SSO.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Delegating authority in access management isn&amp;#x27;t a new idea. This is just &lt;a href=&quot;https://oblique.security/resources/access-control-models/&quot;&gt;discretionary access control&lt;/a&gt;: every resource has an owner, and that owner decides who gets access. 
It&amp;#x27;s exactly how sharing a Google doc works.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;When we started building Oblique, we found that many tech-forward organizations had already built internal access management systems around this principle. Every single one had ownership baked in — they needed to know who was responsible for each app, service, resource, or group so that they knew who to bother when someone needed access or when compliance required an access review.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;App or service owners handle the actual access decisions, including inviting, approving, or reviewing access for what they &amp;quot;own.&amp;quot; They&amp;#x27;ve been delegated both the responsibility and the decision-making authority to decide who actually needs access. This isn’t a title, but a responsibility. And this isn&amp;#x27;t about hierarchy, but about proximity to the resource. A senior engineer who built the service is a better choice than a VP who&amp;#x27;s never used it. Importantly, you need multiple owners — if your single owner goes on parental leave or quits, you&amp;#x27;re back to having IT make decisions about things they&amp;#x27;ve never touched. Not only does IT lack context on what this resource is, they have even &lt;em&gt;less&lt;/em&gt; context than usual since they haven&amp;#x27;t been involved in the recent access decisions.&lt;/p&gt;&lt;h2&gt;Delegation is how security teams scale&lt;/h2&gt;&lt;p&gt;Successful delegation means giving teams authority, not just responsibility. Let the head of Sales decide who gets CRM access: they understand what closes deals, know the cost of additional seats, and have both the business context and financial incentive to make good decisions.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Delegating isn’t easy. It’s scary: what if something goes wrong? 
But just like a new manager learns to give away more and more control — in a safe environment — security teams are learning to give away more and more responsibility. This shouldn’t be done blindly, but with guardrails to prevent disasters, and accepting that there will be exceptions.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Acknowledging that security can&amp;#x27;t do &lt;em&gt;everything&lt;/em&gt; is part of growing up as an organization scales. Just like in application security or other parts of corporate security, successful delegation in access management means identifying owners, defining their responsibilities — and most importantly, giving them the authority to make decisions that work for them.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Stop trying to make git happen</title><link>https://oblique.security/blog/git-internal-tools/</link><guid isPermaLink="true">https://oblique.security/blog/git-internal-tools/</guid><description>Internal tools built as code come with version control and audit logs for free, but git becomes a barrier for non-engineers to use these tools.</description><pubDate>Wed, 10 Sep 2025 16:55:24 GMT</pubDate><content:encoded>&lt;p&gt;In the past few years, engineers have really come to embrace &lt;a href=&quot;https://en.wikipedia.org/wiki/Infrastructure_as_code&quot;&gt;“as code”&lt;/a&gt; or &lt;a href=&quot;https://www.gitops.tech/&quot;&gt;GitOps&lt;/a&gt; workflows. And what’s not to love: you can stay in an editor you’re familiar with, and preview, test, and deploy changes automatically. This has been popularized for many things developers need to manage — we’ve seen infrastructure as code, config as code, and policy as code. Security teams have gravitated toward these workflows too.&lt;/p&gt;&lt;p&gt;That’s no surprise: you get version control, reviews, rollbacks, and audit logs… for free! 
If the only user of your tool is the security team, this might be reasonable. But if your internal tool needs to be used by anyone who isn&amp;#x27;t an engineer, it needs to be usable for everyone — and as code workflows aren’t for everyone.&lt;/p&gt;&lt;h2&gt;Not everyone can use git&lt;/h2&gt;&lt;p&gt;As code workflows become a barrier as soon as someone outside the engineering org needs to use an as code tool as a part of their job. Do you think everyone outside of engineering — in sales, marketing, support, product, HR, or ops…&lt;/p&gt;&lt;ul&gt;&lt;li&gt;… has a GitHub account?&lt;/li&gt;&lt;li&gt;… knows Markdown?&lt;/li&gt;&lt;li&gt;… can open a PR?&lt;/li&gt;&lt;li&gt;… can set up a dev environment, even if it’s a devcontainer or codespace?&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;No.&lt;/p&gt;&lt;p&gt;I &lt;em&gt;worked at GitHub&lt;/em&gt;, and I still don’t feel comfortable using git most of the time, even when referencing &lt;a href=&quot;https://jvns.ca/blog/2024/04/25/new-zine--how-git-works-/&quot;&gt;excellent documentation&lt;/a&gt;. I refuse to use the git CLI. And I would rather throw my computer out the window than figure out how to rebase my branch. (I shouldn’t litter my neighbor’s yard, so what I often do is just start a new PR.) I’m a more technical user — and if &lt;em&gt;I&lt;/em&gt; can’t use git, it’s just not reasonable to ask your employees to.&lt;/p&gt;&lt;p&gt;Forcing managers to update permissions in git, or HR to create new accounts in a YAML template, creates a problem that extensive training can’t solve: realistically, you’re looking at constant support requests. They’ll come to you every time they need help with that task, because you haven’t given them the tools they need to do it themselves.&lt;/p&gt;&lt;p&gt;You might even try to build a UI on top of your as code system, which means keeping context in sync between code and a user interface. 
Now you’re juggling UI changes and git branches, generating PRs through bots for basic edits, and still don’t have a good flow when a reviewer wants to tweak the generated code. Congrats! You’re now using git as a bad relational database.&lt;/p&gt;&lt;h2&gt;Low-code tools exist for a reason&lt;/h2&gt;&lt;p&gt;We have lots of internal tools that less technical users need to use — or build. We’ve seen an explosion in popularity in low-code or no-code portals for these tools (and now, LLM-enabled workflows). People who want workflow automation don’t necessarily code, and people who need to use those workflows also don’t necessarily code.&lt;/p&gt;&lt;p&gt;AI has certainly made this easier, but not necessarily better. Claude means that I can now generate a PR against an as code system for an access request, but is a critical everyday workflow really what you want me to be vibing?&lt;/p&gt;&lt;p&gt;So why do we keep building internal tools that require git? Often, because as code provides properties we want, like version control and review workflows. But optimizing only for what you need as a builder — and ignoring what your user (which could be anyone else in the organization) needs — is how you end up with a tool that no one uses.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When you’re building internal tools for everyone in the org, they need to work for everyone in the org. User experience matters more than developer convenience.&lt;/strong&gt;&lt;/p&gt;&lt;h2&gt;&amp;quot;As code&amp;quot; doesn&amp;#x27;t literally have to mean &lt;em&gt;as code&lt;/em&gt;&lt;/h2&gt;&lt;p&gt;When we started building Oblique, we started with as code definitions for access controls. That’s what IT and security teams were asking for, and what so many of them have built internally. But the more we talked to these teams, the more we realized what they wanted were the tools to preview access changes and ensure the right review requirements are met, in order to be able to safely make access changes. 
They wanted the &lt;em&gt;benefits&lt;/em&gt; of as code workflows.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://mitchellh.com/writing/as-code&quot;&gt;“As code” doesn’t literally have to mean &lt;em&gt;as code&lt;/em&gt;&lt;/a&gt;. You can get the same benefits by building a thoughtful system that takes your change management requirements into account, without making someone learn git.&lt;/p&gt;&lt;p&gt;As code workflows make sense for systems that engineering owns, like infrastructure. But for systems that everyone in an org needs to use, like managing corporate access, requesting policy exceptions, or reviewing vendors? Build those tools with the people who actually use them in mind.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Why your RBAC roles aren&apos;t actually roles</title><link>https://oblique.security/blog/rbac-roles/</link><guid isPermaLink="true">https://oblique.security/blog/rbac-roles/</guid><description>A role in RBAC should represent what someone actually does in your environment. Your job title makes a bad RBAC role: it&apos;s your position, not your function.</description><pubDate>Tue, 02 Sep 2025 17:06:40 GMT</pubDate><content:encoded>&lt;p&gt;Most security teams understand &lt;a href=&quot;https://csrc.nist.gov/Projects/role-based-access-control/faqs&quot;&gt;Role-Based Access Control (RBAC)&lt;/a&gt; basics: map permissions to roles, assign users to roles, and you can control access. Simple enough. Well, except that every tool you use implements RBAC &lt;em&gt;slightly&lt;/em&gt; differently, so you do need to think about it: who should be our &lt;a href=&quot;https://docs.github.com/en/organizations/managing-peoples-access-to-your-organization-with-roles/using-organization-roles#about-pre-defined-organization-roles&quot;&gt;GitHub CI/CD Admin&lt;/a&gt;? 
What’s the difference between a &lt;a href=&quot;https://vercel.com/docs/rbac/access-roles&quot;&gt;Vercel Member, Developer, and Contributor&lt;/a&gt;?&lt;/p&gt;&lt;p&gt;But there&amp;#x27;s a fundamental confusion that makes RBAC implementations difficult to manage: we keep using job titles as access roles. You should never encounter a &amp;quot;Growth Hacker&amp;quot; or &amp;quot;Chief Happiness Officer&amp;quot; in your access control system, even though they might be in your org chart. Your job title makes a bad RBAC role.&lt;/p&gt;&lt;h2&gt;Roles describe what you do, not who you are&lt;/h2&gt;&lt;p&gt;A role in RBAC should represent what someone actually does in your environment. Your job title, on the other hand, is about reporting structure, compensation bands, and career progression. It&amp;#x27;s your position in the corporate hierarchy, not your function.&lt;/p&gt;&lt;p&gt;Roles like &amp;quot;Customer Support&amp;quot; or &amp;quot;IT Admin&amp;quot; might make sense in both contexts, because they not only describe specific job functions, but are also tightly coupled to typical tasks someone in that position needs to accomplish. These roles are the exception rather than the norm.&lt;/p&gt;&lt;p&gt;Take a &amp;quot;Product Manager&amp;quot; instead. A Product Manager working on billing needs access to Stripe and support tickets, but a Product Manager working on Android needs access to app builds and bug reports. Same title, completely different functional needs. 
If you create a single &amp;quot;Product Manager&amp;quot; role, you&amp;#x27;re forced to either over-privilege everyone or create dozens of variations like &amp;quot;Product Manager - Billing&amp;quot; and &amp;quot;Product Manager - Android.&amp;quot; What you actually want is a role that describes the billing team or the Android team.&lt;/p&gt;&lt;p&gt;What makes this worse is that job titles change constantly — should changing your job title in Workday to “AI Engineer” really change access? — and what those jobs actually entail evolves even faster. In today&amp;#x27;s flexible organizations, people wear multiple hats that don&amp;#x27;t fit neatly into traditional role definitions. (Seriously, what is a &amp;quot;GtM Engineer&amp;quot;?)&lt;/p&gt;&lt;h2&gt;How we ended up in this mess&lt;/h2&gt;&lt;p&gt;When organizations first adopt RBAC, they might go through a &amp;quot;role mapping&amp;quot; exercise to consolidate permissions into common workflows. Too often, IT leads this process without sufficient business context. It&amp;#x27;s easy to map existing permissions to a role that exists in an org chart, like &amp;quot;Marketing Manager&amp;quot; (or worse, “Vendor” or “Partner”), without asking if that&amp;#x27;s really the right access control boundary.&lt;/p&gt;&lt;p&gt;Being too generic doesn&amp;#x27;t help either. An &amp;quot;Admin&amp;quot; role doesn&amp;#x27;t actually mean anything. And &amp;quot;Data Reader&amp;quot; could represent 10 different job titles with completely different needs. You can&amp;#x27;t create meaningful roles without understanding what work actually looks like.&lt;/p&gt;&lt;p&gt;This gap has become glaringly obvious in the past few months as we rush to deploy AI agents: just like humans need appropriate roles, agents need appropriate OAuth scopes. We&amp;#x27;re struggling with the same “role explosion” problem: it’s unrealistic to create hundreds of OAuth scopes, and granting broad access is risky. 
We never learned to think functionally about access control for humans, so we&amp;#x27;re making the same conceptual mistakes with agents.&lt;/p&gt;&lt;h2&gt;Making RBAC work for you&lt;/h2&gt;&lt;p&gt;So, how should you think about adopting RBAC?&lt;/p&gt;&lt;p&gt;First, &lt;strong&gt;work with business teams to define roles.&lt;/strong&gt; Every organization is different, and there&amp;#x27;s no one-size-fits-all approach. Ask business teams: who does the same tasks? Who works together on this specific type of work? People with completely different job titles might end up with identical access to systems.&lt;/p&gt;&lt;p&gt;Good functional roles sound boring and obvious: “Blog Publisher”, “Invoice Processor”, or “Customer Data Viewer”. Many people across different teams might need to publish blog posts, so they all need access to the publishing platform. If your role name requires a hyphen or sounds like it belongs on a LinkedIn influencer&amp;#x27;s bio, you&amp;#x27;re probably doing it wrong. Keep definitions simple enough that non-technical stakeholders can understand — and request access to — them. Take the time to define roles that are clear and have sufficient context, as this will save you time later.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Use fewer, broader roles rather than overfitting to every scenario.&lt;/strong&gt; When you have hundreds of roles, you&amp;#x27;ve created a system that nobody understands. Role mapping turns into “role mining” or even “role engineering”. Role sprawl isn&amp;#x27;t just annoying — it&amp;#x27;s impossible to debug when something goes wrong if you can&amp;#x27;t tell why roles exist, who should have them, or how two similar-sounding roles differ.&lt;/p&gt;&lt;p&gt;Start by only defining roles that you actually need to restrict. Split out the riskiest permissions into roles that need to be more tightly controlled, and keep a reasonable base set of permissions for everyone else. 
You can start with basic permission levels that apply to any resource, like Reader, Writer, and Admin. These boring old roles exist because they actually work — &lt;a href=&quot;https://cloud.google.com/iam/docs/roles-overview#basic&quot;&gt;your cloud provider didn&amp;#x27;t pick them by accident&lt;/a&gt;. Don’t define roles for the sake of it; add them as you need to enable new users and restrict sensitive permissions. Accept some over-privilege to avoid the administrative hell of managing hundreds of hyper-specific roles. You&amp;#x27;ll have exceptions regardless, so spend your time building a decent exception process instead of trying to create the perfect role taxonomy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Plan for ongoing maintenance.&lt;/strong&gt; This is the part that often gets left by the wayside. Roles need regular review and updates — you can even do this at the same time as your user access reviews! Instead of just checking if Bob should still have the Admin role, also review what the Admin role can actually do and whether that still makes sense. Roles have a nasty habit of scope creep, quietly accumulating permissions over time as teams request access to new tools and systems. Check periodically that your &amp;quot;Viewer&amp;quot; role can&amp;#x27;t actually modify data. Audit role capabilities, not just role assignments.&lt;/p&gt;&lt;h2&gt;Mix and match your access models&lt;/h2&gt;&lt;p&gt;Real organizations are complex, people wear multiple hats, and business needs change faster than security can keep up. The issue isn’t RBAC, it’s how organizations implement RBAC.&lt;/p&gt;&lt;p&gt;Most organizations don’t implement a “pure” &lt;a href=&quot;https://oblique.security/resources/access-control-models/&quot;&gt;access control model&lt;/a&gt;. 
Instead, they &lt;a href=&quot;https://cheatsheetseries.owasp.org/cheatsheets/Authorization_Cheat_Sheet.html#prefer-attribute-and-relationship-based-access-control-over-rbac&quot;&gt;blend&lt;/a&gt; RBAC for stable business functions, ABAC for context-dependent decisions, and other approaches as needed. Rather than forcing everything into roles, let attributes be attributes — someone&amp;#x27;s department is useful as an attribute for access decisions, not as another role to manage.&lt;/p&gt;&lt;h2&gt;Keep RBAC functional&lt;/h2&gt;&lt;p&gt;We struggle to maintain access controls because we&amp;#x27;ve designed them around what people are called rather than what they actually do. Job functions matter for access control, but they&amp;#x27;re just one piece of the puzzle. Where you&amp;#x27;re working from, what time it is, and who you&amp;#x27;re collaborating with often matter more than your title.&lt;/p&gt;&lt;p&gt;At the end of the day, RBAC is an imperfect tool. Design a sane and maintainable set of groups, focus your security efforts on the riskiest permissions, and build processes to handle the inevitable exceptions and changes coming your way.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Comms groups inevitably become access groups</title><link>https://oblique.security/blog/comms-access-groups/</link><guid isPermaLink="true">https://oblique.security/blog/comms-access-groups/</guid><description>Comms groups map to how people actually work, but often access groups don&apos;t. Comms groups always become access groups. It&apos;s not a matter of if, but when.</description><pubDate>Thu, 28 Aug 2025 17:09:40 GMT</pubDate><content:encoded>&lt;p&gt;IT teams create two types of groups that seem logical in theory: communication groups and access groups. &amp;quot;Comms groups&amp;quot; are meant for project updates and discussions, and so are easier to join. 
&amp;quot;Access groups&amp;quot; are meant for actual system permissions and are much more tightly controlled — managed by IT, and with approval requirements set by security.&lt;/p&gt;&lt;p&gt;The theory sounds reasonable: keep the friction low for people who just need to stay informed, but maintain strict controls for actual access.&lt;/p&gt;&lt;p&gt;This doesn&amp;#x27;t work in practice. &lt;strong&gt;Comms groups always become access groups. It&amp;#x27;s not a matter of if, but when.&lt;/strong&gt;&lt;/p&gt;&lt;h2&gt;How groups actually get used&lt;/h2&gt;&lt;p&gt;This happens because in many platforms, the same group management systems that handle communication also control access. In GCP, your Google Groups directly become your IAM groups. In Slack, when you share a document with a channel, you&amp;#x27;re granting access to everyone in that channel. A comms group for an offsite ends up being used for access to a new tool built during the event. A social activities group for the Boston office becomes the access control for the newly installed bike racks. IT teams try to prevent this — usually by creating naming conventions like “Project-X-COMMS-ONLY”. Just like your file naming convention — &lt;a href=&quot;https://xkcd.com/1459/&quot;&gt;you know, the one where “Filename_FINAL_v3” gets superseded by “Filename_FINAL_v4_ACTUALLY_FINAL”&lt;/a&gt; — it&amp;#x27;s a beautiful lie that doesn&amp;#x27;t work.&lt;/p&gt;&lt;p&gt;Users will follow the path of least resistance. When you need to grant access to a system, and everyone who needs access is already &lt;em&gt;right there&lt;/em&gt; in that group, you use it… even if it just happens to be the COMMS-ONLY group. (It’s rare that the reverse happens.)&lt;/p&gt;&lt;p&gt;Why is the comms group often a better fit for access? Comms groups map to how people actually work, and often, access groups don&amp;#x27;t (but they should). 
Access groups tend to be tightly managed to meet specific compliance or security requirements. As users often need to request access to these, they become bureaucratic obstacles that get in the way of actual work.&lt;/p&gt;&lt;p&gt;Comms groups, on the other hand, naturally follow project and team lines — they contain the people you need input from or need to keep informed. IT teams typically want these groups to be easy to join since they enable business operations and seem low-risk (even when the conversations themselves involve highly sensitive information). Our access controls end up &lt;a href=&quot;https://en.wikipedia.org/wiki/Conway%27s_law&quot;&gt;mirroring our communication patterns&lt;/a&gt;, whether we plan for it or not.&lt;/p&gt;&lt;h2&gt;The gap between groups and permissions&lt;/h2&gt;&lt;p&gt;Part of the reason IT teams try to create this separation is that we fundamentally lack visibility into what we&amp;#x27;re actually granting when we add someone to a group. The system managing groups (often your IdP) isn’t the same system where your permissions live (often an application).&lt;/p&gt;&lt;p&gt;This is where security breaks down: when you add a user to a group, you&amp;#x27;re making an access decision blind to the actual access being granted. You can&amp;#x27;t see what downstream permissions a group membership grants — access to customer data, financial data, or other sensitive resources. (Look at Google Groups’ newly added &lt;a href=&quot;https://support.google.com/a/answer/10607394&quot;&gt;security groups&lt;/a&gt;, which perpetuate this naming separation while offering only limited restrictions. 
It&amp;#x27;s too little, too late — and too confusing to solve the real problem.)&lt;/p&gt;&lt;p&gt;IT teams try to separate comms and access groups to reduce what they need to manage, control, and audit, but sensitive access inevitably creeps into places it doesn&amp;#x27;t belong.&lt;/p&gt;&lt;h2&gt;Design for reality&lt;/h2&gt;&lt;p&gt;So what can you do? Well, stop fighting this pattern, and accept the inevitable. Instead of pretending comms and access groups will remain separate, create unified project- or team-based groups from the beginning. Make it explicit that joining the project group means both receiving comms &lt;em&gt;and&lt;/em&gt; getting relevant access.&lt;/p&gt;&lt;p&gt;The goal isn&amp;#x27;t to make it even harder for users to get access — they already find it plenty frustrating. Acknowledge that comms and access groups are one and the same, and design proper controls from the beginning.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Injection-proof SQL builders in Go </title><link>https://oblique.security/blog/injection-proof-sql/</link><guid isPermaLink="true">https://oblique.security/blog/injection-proof-sql/</guid><description>SQL builders are always one bad logic bug away from full-blown query injection. Oblique uses Go type tricks to prevent this entire class of backend issues.</description><pubDate>Mon, 18 Aug 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;A Go product that uses SQL will inevitably implement some higher level logic on top of &lt;a href=&quot;https://pkg.go.dev/database/sql&quot;&gt;database/sql&lt;/a&gt;. There are just too many cases where a single string with a fixed set of arguments isn’t flexible enough. Using different database flavors for dev and prod which take different placeholders (&lt;code&gt;&amp;quot;?&amp;quot;&lt;/code&gt; vs &lt;code&gt;&amp;quot;$1&amp;quot;&lt;/code&gt;). 
Inserting &lt;a href=&quot;https://stackoverflow.com/a/21112176&quot;&gt;multiple rows&lt;/a&gt; in a single statement. Performing the same query with different WHERE conditions.&lt;/p&gt;&lt;p&gt;While builders are often necessary for development, they’re also absolutely terrifying for security.&lt;/p&gt;&lt;p&gt;Sure, Go has built-in &lt;a href=&quot;https://go.dev/doc/database/sql-injection&quot;&gt;parameterized values&lt;/a&gt; for input variables, but what if we’re trying to specify column, row, or table names? Most packages will happily accept arbitrary input in these fields and run it directly against your database.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Runnable example: https://go.dev/play/p/bGiCWp6xk-z
package main

import (
	&quot;fmt&quot;

	&quot;github.com/huandu/go-sqlbuilder&quot;
)

func main() {
	userInput := `1;
		DROP TABLE students;
		SELECT (id, name) FROM demo.user WHERE status`
	where := `status = ` + userInput // untrusted input concatenated straight into the WHERE clause
	// ...
	sql := sqlbuilder.Select(&quot;id&quot;, &quot;name&quot;).From(&quot;demo.user&quot;).
		Where(where).
		String()
	fmt.Println(sql)
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When building Oblique, we weren’t thrilled with the idea of accidentally introducing a 90s vulnerability into a security product built in 2025. If a customer trusts Oblique to manage authorization in their environment, we should do more than hope our new backend hire doesn’t misuse a Go API.&lt;/p&gt;&lt;h2&gt;A better way&lt;/h2&gt;&lt;p&gt;There turns out to be a clever trick with the Go type system to ensure an argument is free from dynamic input. That way, we can constrain the builder’s inputs rather than sanitize or detect after the fact.&lt;/p&gt;&lt;p&gt;Consider the following package:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;package say

import &quot;fmt&quot;

type myString string

func Hello(name myString) {
	fmt.Printf(&quot;Hello, %s!\n&quot;, name)
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Because &lt;code&gt;myString&lt;/code&gt; is private, an external package can’t name the type, so it has no way to declare a variable of that type.&lt;/p&gt;&lt;p&gt;This should make it impossible for another package to call &lt;code&gt;say.Hello&lt;/code&gt;, with one notable exception. Go is relatively strict about mixing types. You can’t add an &lt;code&gt;int&lt;/code&gt; to a &lt;code&gt;float64&lt;/code&gt; or even an &lt;code&gt;int&lt;/code&gt; to an &lt;code&gt;int64&lt;/code&gt;. To compensate, Go &lt;a href=&quot;https://go.dev/blog/constants&quot;&gt;constants&lt;/a&gt; allow programs to define untyped values whose type is inferred when they’re used in some context that requires one. That’s why you can do something like the following:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const thirtyDays = 30*24*time.Hour&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Rather than explicitly typing every number:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;const thirtyDays = time.Duration(30)*time.Duration(24)*time.Hour&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The constants &lt;code&gt;30&lt;/code&gt; and &lt;code&gt;24&lt;/code&gt; are coerced to a &lt;a href=&quot;https://pkg.go.dev/time#Duration&quot;&gt;&lt;code&gt;time.Duration&lt;/code&gt;&lt;/a&gt; by being multiplied by &lt;code&gt;time.Hour&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Let’s go back to our earlier example. While another package can’t create a variable with the &lt;code&gt;myString&lt;/code&gt; type, it is possible to pass a constant! An untyped string constant is implicitly converted to &lt;code&gt;myString&lt;/code&gt; simply by being used as an argument value.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Runnable example: https://go.dev/play/p/_0NyeiTp-M8
func main() {
	say.Hello(&quot;Eric&quot;) // Works: the untyped constant converts to the private parameter type.
}&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Constants and builders&lt;/h2&gt;&lt;p&gt;We can use this observation to force callers to only pass constants to our APIs, which by definition will never be dependent on dynamic input. You can’t construct a constant using &lt;code&gt;fmt.Sprintf&lt;/code&gt; or other string concatenation that depends on a live variable.&lt;/p&gt;&lt;p&gt;Here’s a full example builder that uses private string types for column, row, and table names:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;package sqlb

import &quot;strings&quot;

// Private string types that can’t be referenced by other packages.
type col string
type row string
type table string

// Helpers that let other packages build values of these types dynamically,
// while still requiring constant strings as arguments.
func Row(val row) row        { return val }
func Rows(vals ...row) []row { return vals }
func Table(val table) table  { return val }

type SelectBuilder struct {
	rows    []row
	from    table
	whereEq *whereEq
}

func Select(rows ...row) *SelectBuilder {
	return &amp;SelectBuilder{rows: rows}
}

func (b *SelectBuilder) From(table table) *SelectBuilder {
	b.from = table
	return b
}

type whereEq struct {
	col col
	val any
}

func (b *SelectBuilder) WhereEq(col col, val any) *SelectBuilder {
	b.whereEq = &amp;whereEq{col, val}
	return b
}

func (s *SelectBuilder) String() string {
	b := &amp;strings.Builder{}
	b.WriteString(&quot;SELECT &quot;)
	for i, row := range s.rows {
		if i != 0 {
			b.WriteString(&quot;, &quot;)
		}
		b.WriteString(string(row))
	}
	b.WriteString(&quot; FROM &quot;)
	b.WriteString(string(s.from))

	if s.whereEq != nil {
		b.WriteString(&quot; WHERE &quot;)
		b.WriteString(string(s.whereEq.col))
		b.WriteString(&quot; = ?&quot;)
	}
	return b.String()
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With this package, you can still write the basic builder logic you’d expect:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Runnable example: https://go.dev/play/p/Nauyr8LNdUr
var rows = sqlb.Rows(&quot;id&quot;, &quot;name&quot;)

func main() {
	sql := sqlb.Select(rows...).
		From(&quot;demo.user&quot;).
		WhereEq(&quot;status&quot;, 1).
		String()
	fmt.Println(sql)
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;But, if you ever accidentally depend on a string variable, the program refuses to compile!&lt;/p&gt;&lt;pre&gt;&lt;code&gt;var rows = sqlb.Rows(&quot;id&quot;, &quot;name&quot;)

func main() {
	userInput := ` = 1; DROP TABLE students;`
	whereCol := `status` + userInput
	// ...
	sql := sqlb.Select(rows...).
		From(&quot;demo.user&quot;).
		WhereEq(whereCol, 1). // compile error: whereCol is a string variable, not a constant
		String()
	fmt.Println(sql)
}

// cannot use whereCol (variable of type string) as sqlb.col value in argument to sqlb.Select(rows...).From(&quot;demo.user&quot;).WhereEq&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Helpers like &lt;code&gt;Row(val row) row&lt;/code&gt; still allow other packages to dynamically construct the set of rows to select, but guarantee their values trace back to constants and aren’t dependent on user controlled values.&lt;/p&gt;&lt;h2&gt;Footguns&lt;/h2&gt;&lt;p&gt;Smart engineers at prominent companies accidentally write vulnerabilities like this all the time. As a codebase gets bigger, it’s just not possible to depend on human review as the security check. Nor is this restricted to SQL. Similar issues with &lt;a href=&quot;https://github.com/advisories/GHSA-77gc-fj98-665h&quot;&gt;JWT libraries&lt;/a&gt; and &lt;a href=&quot;https://go.dev/blog/osroot&quot;&gt;archive unpacking&lt;/a&gt; are routine when APIs make it easy to accidentally do the wrong thing.&lt;/p&gt;&lt;p&gt;If you are building a package with security implications, it should be as hard as possible (if not impossible) for users to do something insecure. For SQL builders, what’s better than insecure code not compiling at all?&lt;/p&gt;</content:encoded><author>Eric Chiang</author></item><item><title>The evolution of authentication, from passwords to passkeys</title><link>https://oblique.security/blog/history-of-authentication/</link><guid isPermaLink="true">https://oblique.security/blog/history-of-authentication/</guid><description>Authentication has evolved from simple passwords to federated systems with passwordless logins, continuously balancing security and usability.</description><pubDate>Wed, 13 Aug 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;em&gt;This blog post is a written version of a talk that our cofounder Maya gave at BSidesSLC in April. 
You can also &lt;a href=&quot;https://www.youtube.com/watch?v=whNMgLVJpWw&quot;&gt;watch the recording&lt;/a&gt; and &lt;a href=&quot;https://github.com/mayakacz/presentation-slides/blob/master/20250411%20-%20BSidesSLC%20-%20The%20evolution%20of%20auth.pdf&quot;&gt;get the slides&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;Authentication has always been defined by the tension between security and usability. For years, making authentication more secure meant making it harder to use — and making it easier to use meant introducing new security holes. Security innovations used to start in enterprise and trickle down to consumer applications. But now, that flow has reversed, with consumer authentication improvements like TouchID making their way into enterprise environments.&lt;/p&gt;&lt;p&gt;This push and pull between security and usability has been productive as both sides of the equation have improved over time. Increasingly, the best authentication controls we have are the most usable ones.&lt;/p&gt;&lt;p&gt;One thing hasn’t changed: we still need to verify that users are who they claim to be. But how we do that verification has changed: authentication has moved from simple passwords to complex passwords, from islands of identity to federated systems, and from single-factor to multi-factor authentication and passwordless experiences.&lt;/p&gt;&lt;h2&gt;Basic authentication: usernames and passwords&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/103a17cc6501cb3eeb9141115b86c82774a9571a-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Passwords are relatively new. In 1961, Fernando Corbató at MIT &lt;a href=&quot;https://www.bbc.com/news/technology-48988091&quot;&gt;created passwords&lt;/a&gt; for time-sharing systems where multiple users needed different access levels and private files. 
This ensured users were only authorized to access their files and use their assigned time on the shared system.&lt;/p&gt;&lt;p&gt;Just a few years later, &lt;a href=&quot;https://www.wired.com/2012/01/computer-password/&quot;&gt;a doctoral student who wanted more time on that system realized that he could print out the list of all of the passwords, and log in as someone else&lt;/a&gt;. Turns out the passwords were just stored in plaintext.&lt;/p&gt;&lt;p&gt;To protect against this password theft, &lt;a href=&quot;https://www.usenix.org/publications/loginonline/bcrypt-25-retrospective-password-security&quot;&gt;Unix introduced password hashing&lt;/a&gt; in the 1970s. By using a one-way function (a hash), the system can store enough information to verify that users are presenting the right passwords, without storing, and therefore risking leaking, the passwords themselves.&lt;/p&gt;&lt;p&gt;As the number of systems a user needed to log into increased, so did the number of passwords they needed, and users frequently reused the same password across multiple systems. However, if a user reused a password in multiple places, and any one of those instances was compromised — for example, if the system didn’t actually hash and store passwords properly — then that password could be tested against another system to see if it would give access. Attackers could then replay these compromised credentials against other services (&lt;a href=&quot;https://en.wikipedia.org/wiki/Credential_stuffing&quot;&gt;credential stuffing&lt;/a&gt; attacks) or feed them into broader &lt;a href=&quot;https://en.wikipedia.org/wiki/Password_cracking&quot;&gt;password cracking&lt;/a&gt; efforts to expand dictionaries of common passwords and their corresponding hashes.&lt;/p&gt;&lt;p&gt;To make their passwords easier to remember, users started using variations on the same passwords. 
For example, rather than using the &lt;a href=&quot;https://github.com/danielmiessler/SecLists/blob/master/Passwords/Common-Credentials/10k-most-common.txt&quot;&gt;27th most common password &lt;code&gt;hunter&lt;/code&gt;&lt;/a&gt;, they might instead use the password &lt;a href=&quot;https://knowyourmeme.com/memes/hunter2&quot;&gt;&lt;code&gt;hunter2&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;&lt;h2&gt;Password complexity requirements&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/7334c9b561ee88a402b9c8e709bb3595aca4802a-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;To prevent weak passwords like &lt;code&gt;password123&lt;/code&gt;, organizations introduced password complexity requirements. Rather than letting a user set any password, users’ passwords had to meet specific requirements in order to make them harder to crack.&lt;/p&gt;&lt;p&gt;When &lt;a href=&quot;https://csrc.nist.gov/csrc/media/publications/sp/800-63/ver-10/archive/2004-06-30/documents/sp800-63-v1-0.pdf&quot;&gt;NIST SP 800-63&lt;/a&gt; was first published in 2004, it suggested minimum requirements for password complexity, including irregular capitalization, special characters, and at least one numeral. Although it’s mathematically true that passwords using a variety of characters greater than just lowercase letters should be harder to crack, in reality, many passwords that humans create to meet these requirements end up becoming a &lt;a href=&quot;https://www.youtube.com/watch?v=aHaBH4LqGsI&quot;&gt;predictable ordering&lt;/a&gt; of capital letters, lowercase letters, numbers, and symbols, like &lt;code&gt;Password123!&lt;/code&gt; or &lt;a href=&quot;https://en.wikipedia.org/wiki/Munged_password&quot;&gt;leetspeak&lt;/a&gt; like &lt;code&gt;P@ssw0rd!&lt;/code&gt;. 
Meeting specific password requirements can be &lt;a href=&quot;https://neal.fun/password-game/&quot;&gt;infuriating&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Password complexity requirements didn’t improve security — they made it worse. Users couldn’t remember complex passwords, so they’d reset them more frequently. And, there often were additional requirements around regular password rotation. An easy-to-remember password is not secure, but a secure password is hard to remember, harder to type, and so less usable. It’s an unfortunate tradeoff.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-63b.pdf&quot;&gt;The current industry guidelines&lt;/a&gt; instead focus on password entropy — which is achievable with a long password, like &lt;a href=&quot;https://xkcd.com/936/&quot;&gt;&lt;code&gt;correcthorsebatterystaple&lt;/code&gt;&lt;/a&gt; — and discourage specific complexity rules and rotation of non-compromised credentials. Security doesn’t have to be completely at odds with usability: rather than a perfectly random password, &lt;a href=&quot;https://blog.1password.com/a-smarter-password-generator/&quot;&gt;a fairly random but pronounceable password&lt;/a&gt; is much more usable and still more secure.&lt;/p&gt;&lt;p&gt;Password storage has also improved: it’s easier than ever to properly salt and hash passwords. 
&lt;a href=&quot;https://en.wikipedia.org/wiki/Argon2&quot;&gt;&lt;code&gt;argon2&lt;/code&gt;&lt;/a&gt;, as the winner of the &lt;a href=&quot;https://www.password-hashing.net/&quot;&gt;Password Hashing Competition&lt;/a&gt;, has taken over from &lt;a href=&quot;https://en.wikipedia.org/wiki/Bcrypt&quot;&gt;&lt;code&gt;bcrypt&lt;/code&gt;&lt;/a&gt; — not only is it fast, it also lets the implementer trade off speed and memory usage with requirements around side-channel and cracking attacks.&lt;/p&gt;&lt;p&gt;But moving to long, high-entropy passwords that are unique for each system created yet another issue: how can users possibly remember all of these passwords?&lt;/p&gt;&lt;h2&gt;Password managers&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/46e0ed34e6591afcebaa5bfdc32b297fa146a931-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Password managers helped users manage their growing collections of credentials — &lt;a href=&quot;https://nordpass.com/blog/how-many-passwords-does-average-person-have/&quot;&gt;168 passwords on average&lt;/a&gt;. They’re available both as standalone apps like &lt;a href=&quot;https://1password.com/&quot;&gt;1Password&lt;/a&gt; and built directly into platforms like &lt;a href=&quot;https://passwords.google.com/&quot;&gt;Google Chrome&lt;/a&gt; and &lt;a href=&quot;https://support.apple.com/en-us/104955&quot;&gt;iOS&lt;/a&gt;. Password managers have become mainstream security advice, even superseding VPNs for family recommendations.&lt;/p&gt;&lt;p&gt;Password managers also gave IT teams visibility into password practices for the first time. Organizations could finally enforce password policies and see whether users were following them. 
But password manager adoption remained inconsistent, and the core problem persisted: too many systems requiring separate credentials.&lt;/p&gt;&lt;h2&gt;Enterprise identity federation&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/53762c926e4469d6b1811b85e1d534fc4d3a3b9f-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Identity federation emerged because organizations hit two walls. Users couldn’t manage dozens of passwords, and IT teams couldn’t track access across multiple systems. Centralized identity management solved both problems: users get one set of credentials to remember, and IT gets a single place to manage access.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/active-directory-domain-services-overview&quot;&gt;Microsoft Active Directory (AD)&lt;/a&gt;, launched in 1999, was the first real solution to enterprise identity sprawl. Instead of maintaining separate user accounts for every server, printer, and application, IT teams could manage everything from one directory. AD became the “single source of truth” for who had access to what within the corporate network. AD implemented &lt;a href=&quot;https://en.wikipedia.org/wiki/Kerberos_(protocol)&quot;&gt;Kerberos&lt;/a&gt; as its authentication protocol, including Kerberos’ delegation and impersonation capabilities, which allowed services to act on behalf of users within the enterprise network.&lt;/p&gt;&lt;p&gt;However, AD only worked on the local network. &lt;a href=&quot;https://en.wikipedia.org/wiki/SAML_2.0&quot;&gt;SAML&lt;/a&gt;, introduced in 2001, tried to solve this by letting identity providers send authentication statements and attributes across domain boundaries, extending identity beyond solely AD-based systems.&lt;/p&gt;&lt;p&gt;When cloud services emerged in the mid-2000s, identity needed to extend even further. 
Hosted identity providers like &lt;a href=&quot;https://www.pingidentity.com/en.html&quot;&gt;Ping Identity&lt;/a&gt; (founded in 2002) and &lt;a href=&quot;https://www.okta.com/&quot;&gt;Okta&lt;/a&gt; (founded in 2009) were introduced to bridge the gap: to offer a way to federate on-premises identities into cloud SaaS applications. They weren’t replacing AD, but extending it to work with SaaS applications. This was the beginning of modern identity federation.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://oauth.net/2/&quot;&gt;OAuth&lt;/a&gt;, introduced in 2007, brought delegation to web applications and third-party services. Instead of sharing passwords, users could grant specific permissions to external applications. Later, &lt;a href=&quot;https://openid.net/developers/how-connect-works/&quot;&gt;OpenID Connect&lt;/a&gt;, standardized in 2014, extended OAuth to handle authentication in addition to authorization. Together, these created a comprehensive identity stack:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Centralized identity stores (AD, LDAP) for centralized user, group, and access management&lt;/li&gt;&lt;li&gt;Federation protocols (SAML) to connect across domain boundaries&lt;/li&gt;&lt;li&gt;Cloud identity providers (Okta, Ping) to extend directories to the cloud&lt;/li&gt;&lt;li&gt;Authorization frameworks (OAuth) and authentication extensions (OpenID Connect) to authenticate delegated authority&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The early 2000s Microsoft ecosystem was the golden age of enterprise identity — everything integrated smoothly, letting you authenticate the same way to your desktop, network, and printer.&lt;/p&gt;&lt;h2&gt;Social login&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/04811bf5d240aaf902c91b17e3cdc2ba4892e842-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Consumer identity saw similar consolidation. 
Social login emerged with &lt;a href=&quot;https://developers.facebook.com/blog/post/2008/05/09/announcing-facebook-connect/&quot;&gt;Facebook Connect&lt;/a&gt; (2008) and &lt;a href=&quot;https://support.google.com/accounts/answer/12849458?hl=en&quot;&gt;Sign in with Google&lt;/a&gt; (2011), letting users use existing accounts to federate their personal consumer identities, rather than needing to create new accounts for every site they visited.&lt;/p&gt;&lt;p&gt;Social login succeeded because it solved the same problems enterprises were facing: reducing password fatigue while shifting authentication burden to providers with dedicated security teams. The time-consuming and risky processes for password reset and account recovery could instead be handled by providers with significantly more resources and experience.&lt;/p&gt;&lt;h2&gt;Multi-factor authentication&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/f71f1537edd1dc490bf12467edff52a5f4c5aaef-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Organizations moved from a single password for authentication, to two-factor authentication (2FA), to multi-factor authentication (MFA).&lt;/p&gt;&lt;p&gt;MFA means that just a password isn’t enough — instead, to verify a user, multiple verification types are needed. These are typically described as: “something you know” (like a password), “something you have” (like a device), and “something you are” (using biometrics). The idea is that merely a compromised password alone shouldn’t grant access — attackers would need to compromise multiple factors.&lt;/p&gt;&lt;p&gt;Many kinds of factors have existed over the years:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/RSA_SecurID&quot;&gt;RSA SecurID tokens&lt;/a&gt; (1980s): The original “something you have”: physical devices on a keychain that generated time-based codes. 
Effective but expensive and annoying to carry — plus, the batteries died.&lt;/li&gt;&lt;li&gt;SMS codes (mid-2000s): Convenient, and what is still probably the most widely used second factor today. Though vulnerable to &lt;a href=&quot;https://en.wikipedia.org/wiki/SIM_swap_scam&quot;&gt;SIM swapping&lt;/a&gt;, SMS codes were crucial for making MFA mainstream and accepted. (Plus, they gave us all a legitimate reason to keep our phones within arm’s reach at all times!)&lt;/li&gt;&lt;li&gt;Authenticator apps (2010s): &lt;a href=&quot;https://play.google.com/store/apps/details?id=com.google.android.apps.authenticator2&amp;amp;hl=en_US&quot;&gt;Google Authenticator&lt;/a&gt; (2010), &lt;a href=&quot;https://duo.com/&quot;&gt;Duo&lt;/a&gt; (2010), &lt;a href=&quot;https://www.authy.com/&quot;&gt;Authy&lt;/a&gt; (2011) improved security with TOTP codes generated on the user’s device.&lt;/li&gt;&lt;li&gt;Push notifications (mid-2010s): Instead of generating codes, what if those apps instead sent push notifications? Much more convenient, but these caused “notification fatigue” — users would approve notifications without thinking, even when they hadn’t actually tried to log in. This made phishing attacks easier, not harder.&lt;/li&gt;&lt;li&gt;Hardware security keys (mid-2010s): &lt;a href=&quot;https://www.yubico.com/products/&quot;&gt;YubiKeys&lt;/a&gt; and other &lt;a href=&quot;https://fidoalliance.org/specs/u2f-specs-master/fido-u2f-overview.html&quot;&gt;FIDO U2F&lt;/a&gt; hardware tokens generate unique codes when touched, providing proof of presence without requiring users to type anything. This allows for much longer codes, improving security without requiring more effort from the user. 
This became best practice when &lt;a href=&quot;https://www.yubico.com/blog/use-of-fido-u2f-security-keys-focus-of-2-year-google-study/&quot;&gt;Google published a case study&lt;/a&gt; that showed that implementing hardware keys effectively eliminated successful phishing attacks in their environment.&lt;/li&gt;&lt;li&gt;Platform authenticators (mid-2010s): TPMs in devices enabled hardware-bound key material. Combined with biometrics like &lt;a href=&quot;https://support.apple.com/en-us/102528&quot;&gt;TouchID&lt;/a&gt; (2013), &lt;a href=&quot;https://support.microsoft.com/en-us/windows/configure-windows-hello-dae28983-8242-bb2a-d3d1-87c9d265a5f0&quot;&gt;Windows Hello&lt;/a&gt; (2015), and &lt;a href=&quot;https://support.apple.com/en-us/108411&quot;&gt;FaceID&lt;/a&gt; (2017), these verify multiple factors: that both the user has the device, and the user is the person they say they are.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The biggest barrier to MFA adoption isn’t technical complexity — it’s that many applications still don’t support MFA. &lt;a href=&quot;http://sso.tax/&quot;&gt;Just like with SSO&lt;/a&gt;, you’ll find apps that don’t offer MFA at all, or charge extra for it as an “enterprise” feature.&lt;/p&gt;&lt;h2&gt;Passwordless&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/8e0a0b933feafc801179caff548b8123cd044a36-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;Passwordless authentication takes MFA to its logical conclusion: eliminating passwords entirely. Instead of adding secure factors to passwords, make the secure factor the only factor.&lt;/p&gt;&lt;p&gt;This first started with magic links (mid-2010s), which allowed for email-based authentication without passwords by sending a user either a link or a code via email. Magic links were especially popular for social logins, where forgetting your password meant that you could still access your account via email. 
However, these links are phishable, and allow for account takeover if your email is compromised.&lt;/p&gt;&lt;p&gt;The passwordless movement took off as biometrics became possible, then normal, and eventually, mainstream. Device biometrics (mid-2010s) allowed for platform authenticators like TouchID, Windows Hello, and FaceID to auth using a user’s biometrics — but instead of using these for second factors, using these to secure primary factors.&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;https://webauthn.guide/&quot;&gt;WebAuthn standard&lt;/a&gt; launched in 2019, specifying how to implement FIDO2 for web APIs. By standardizing passwordless authentication for browsers, WebAuthn enabled phishing-resistant credentials across the web.&lt;/p&gt;&lt;p&gt;WebAuthn succeeded because it achieved the seemingly impossible: improving both security and user experience simultaneously. Hardware-bound credentials resist phishing since they’re tied to both physical devices and specific services. When FaceID replaced typing passwords on mobile devices — or entering a 2FA code, or selecting from a password manager — users adopted it because it was faster and easier, not because of its security benefits.&lt;/p&gt;&lt;h2&gt;Passkeys&lt;/h2&gt;&lt;figure&gt;&lt;img src=&quot;https://cdn.sanity.io/images/dlxnfmjc/production/b0a2ff6d7e8e22319f8da629b8c1881333012f39-466x834.png?w=3000&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;a href=&quot;https://fidoalliance.org/passkeys/&quot;&gt;Passkeys&lt;/a&gt; were supposed to make WebAuthn mainstream. The rollout has been messy. &lt;a href=&quot;https://fidoalliance.org/mobileidworld-tech-giants-microsoft-google-and-apple-drive-global-passkey-adoption-with-visa-support/&quot;&gt;Apple, Google, and Microsoft promised a better user experience&lt;/a&gt; in 2022: biometric-based FIDO credentials that work across platforms and devices. 
This would also improve security, as these phishing-resistant credentials are bound to specific origins.&lt;/p&gt;&lt;p&gt;Instead, passkeys created vendor lock-in and user confusion. &lt;a href=&quot;https://world.hey.com/dhh/passwords-have-problems-but-passkeys-have-more-95285df9&quot;&gt;Passkeys have been confusing at best&lt;/a&gt;, due to inconsistent experiences across providers. Cross-device syncing only works within ecosystems: Apple devices to Apple devices, and Chrome sessions to Chrome sessions. Users with multiple platforms face competing password managers during every signup or login, uncertainty about where their passkeys are stored, and unclear account recovery processes. And, passkeys still have &lt;a href=&quot;https://passkeys.directory/&quot;&gt;limited support&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The industry was so excited about WebAuthn and passkeys (rightly so) that we bungled the user experience. Biometric unlocking for password managers and direct passkey authentication compete for attention during the same workflows. Despite being more secure and theoretically easier to use, &lt;a href=&quot;https://arstechnica.com/security/2024/12/passkey-technology-is-elegant-but-its-most-definitely-not-usable-security/&quot;&gt;passkeys aren’t something most of us would recommend to our families&lt;/a&gt;. The industry tried to create a better user experience and made things more confusing instead.&lt;/p&gt;&lt;p&gt;That said, passkeys do represent meaningful progress: moving from something users know (passwords) to something users both are and have (devices with biometrics).&lt;/p&gt;&lt;h2&gt;Looking forward&lt;/h2&gt;&lt;p&gt;The evolution continues beyond just login. Authentication and authorization are blurring together. As authentication becomes more continuous and adaptive, systems now verify both identity and permissions for every action, not just at login. 
This means usability challenges in authentication increasingly apply to authorization too.&lt;/p&gt;&lt;p&gt;The tension between security and usability that defined authentication’s early history is finally resolving. The best authentication controls — hardware security keys, TouchID, well-implemented SSO — are both more secure and more usable than what they replaced.&lt;/p&gt;&lt;p&gt;This matters for security teams beyond just authentication. Instead of assuming that better security means more user friction, look for solutions that improve both. The security controls that actually work are the ones people want to use.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Authenticating GitHub Actions without API keys</title><link>https://oblique.security/blog/github-actions-identity/</link><guid isPermaLink="true">https://oblique.security/blog/github-actions-identity/</guid><description>Instead of minting long-lived API keys and warning users “keep this secret,” let&apos;s use GitHub Actions&apos; OpenID Connect support.</description><pubDate>Thu, 31 Jul 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;At Oblique, we’re building a modern service to define and manage authorization in your corporate environment, and an early “modern” decision we made was to be API-first. Everything a user sees in our UI should integrate seamlessly with RPC clients, configuration-as-code, and MCP clients.&lt;/p&gt;&lt;p&gt;This naturally leads into the question of how we authenticate those clients. Sure, we can mint a long-lived API key and allow-list your CI/CD system’s &lt;a href=&quot;https://api.github.com/meta&quot;&gt;~5000 CIDR ranges&lt;/a&gt;. 
But asking a user to copy and paste key material with a big “Keep this secret!” warning isn’t exactly what I’d describe as “seamless.”&lt;/p&gt;&lt;p&gt;Instead, let’s talk about workload identity and &lt;a href=&quot;https://docs.github.com/en/actions/concepts/security/openid-connect&quot;&gt;GitHub Actions’ OpenID Connect&lt;/a&gt; support.&lt;/p&gt;&lt;h2&gt;OpenID Connect&lt;/h2&gt;&lt;p&gt;&lt;a href=&quot;https://openid.net/developers/how-connect-works/&quot;&gt;OpenID Connect&lt;/a&gt; is a protocol on top of &lt;a href=&quot;https://oauth.net/2/&quot;&gt;OAuth 2.0&lt;/a&gt; that allows third-party services to determine your email, account ID, and display name using standard fields (rather than every IdP implementing its own custom set of APIs). If I’m trying to figure out your email during an OAuth flow and I’m using OpenID Connect, the same code I write to log you into Google now works with &lt;a href=&quot;https://learn.microsoft.com/en-us/entra/identity-platform/v2-protocols-oidc&quot;&gt;Microsoft Entra&lt;/a&gt;, &lt;a href=&quot;https://www.keycloak.org/&quot;&gt;Keycloak&lt;/a&gt;, &lt;a href=&quot;https://dexidp.io/&quot;&gt;Dex&lt;/a&gt;, and &lt;a href=&quot;https://www.okta.com/openid-connect/&quot;&gt;Okta&lt;/a&gt;. Aren’t standards great?&lt;/p&gt;&lt;p&gt;At a low level, the OpenID Connect token response includes a signed JWT from the IdP, called an ID Token, with pre-defined fields. Here’s an example payload from Google’s docs:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
  &quot;iss&quot;: &quot;https://accounts.google.com&quot;,
  &quot;azp&quot;: &quot;1234987819200.apps.googleusercontent.com&quot;,
  &quot;aud&quot;: &quot;1234987819200.apps.googleusercontent.com&quot;,
  &quot;sub&quot;: &quot;10769150350006150715113082367&quot;,
  &quot;at_hash&quot;: &quot;HK6E_P6Dh8Y93mRNtsDB1Q&quot;,
  &quot;hd&quot;: &quot;example.com&quot;,
  &quot;email&quot;: &quot;jsmith@example.com&quot;,
  &quot;email_verified&quot;: &quot;true&quot;,
  &quot;iat&quot;: 1353601026,
  &quot;exp&quot;: 1353604926,
  &quot;nonce&quot;: &quot;0394852-3190485-2490358&quot;
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, if you give a developer a JWT signed by Google that says “this is &lt;a href=&quot;mailto:eric@oblique.security&quot;&gt;eric@oblique.security&lt;/a&gt;,” they’re going to (ab)use it. Over the years, systems started accepting ID Tokens as a primary credential outside of OAuth 2.0, and today you can authenticate directly to systems like &lt;a href=&quot;https://kubernetes.io/docs/reference/access-authn-authz/authentication/#openid-connect-tokens&quot;&gt;Kubernetes&lt;/a&gt;, &lt;a href=&quot;https://developer.hashicorp.com/vault/docs/auth/jwt&quot;&gt;Vault&lt;/a&gt;, and &lt;a href=&quot;https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html&quot;&gt;AWS&lt;/a&gt; by presenting the ID Token itself from a command line or other non-browser system.&lt;/p&gt;&lt;h2&gt;GitHub Actions credentials&lt;/h2&gt;&lt;p&gt;GitHub Actions can request short-lived ID Tokens through an &lt;a href=&quot;https://docs.github.com/en/actions/reference/security/oidc#methods-for-requesting-the-oidc-token&quot;&gt;internal API&lt;/a&gt; exposed to the action. That token’s payload contains metadata signed by GitHub about the runtime environment and what triggered the run. Here’s an example payload with some fields omitted for brevity:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{
  &quot;actor&quot;: &quot;ericchiang&quot;,
  &quot;actor_id&quot;: &quot;2342749&quot;,
  &quot;aud&quot;: &quot;oblique.security&quot;,
  &quot;exp&quot;: 1753783724,
  &quot;iat&quot;: 1753762124,
  &quot;iss&quot;: &quot;https://token.actions.githubusercontent.com&quot;,
  &quot;ref&quot;: &quot;refs/heads/bash-script&quot;,
  &quot;ref_protected&quot;: &quot;false&quot;,
  &quot;ref_type&quot;: &quot;branch&quot;,
  &quot;repository&quot;: &quot;ericchiang/github-actions-oidc-example&quot;,
  &quot;runner_environment&quot;: &quot;github-hosted&quot;,
  &quot;sub&quot;: &quot;repo:ericchiang/github-actions-oidc-example:ref:refs/heads/bash-script&quot;
}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;For whatever reason, GitHub chose to implement Action identity as a single issuer for &lt;em&gt;all&lt;/em&gt; workloads using &lt;em&gt;global&lt;/em&gt; signing keys. This means that &lt;strong&gt;any Action can mint a valid signature for any system that trusts GitHub Actions.&lt;/strong&gt; Yes, my dinky test repo can authenticate to your &lt;a href=&quot;https://medium.com/@bdalpe/using-github-actions-tokens-for-authentication-to-kubernetes-clusters-6032170935b9&quot;&gt;Kubernetes cluster&lt;/a&gt;. It hopefully isn’t &lt;em&gt;authorized&lt;/em&gt; to do anything, but it’s up to the cluster to validate the action.&lt;/p&gt;&lt;p&gt;If the system that’s receiving the tokens doesn’t support custom field validation, you’re forced to pattern match the subject (“sub”) field, which follows the form &lt;a href=&quot;https://docs.github.com/en/actions/reference/security/oidc#example-subject-claims&quot;&gt;“repo:&amp;lt;repo&amp;gt;:ref:&amp;lt;ref&amp;gt;”&lt;/a&gt; for branches. This can lead to some &lt;a href=&quot;https://docs.github.com/en/actions/how-tos/secure-your-work/security-harden-deployments/oidc-in-aws#configuring-the-role-and-trust-policy&quot;&gt;wonky config files&lt;/a&gt;, particularly if you want to authorize pull requests differently than protected branches. GitHub even provides organizational policies for &lt;a href=&quot;https://docs.github.com/en/actions/reference/security/oidc#customizing-the-subject-claims-for-an-organization-or-repository&quot;&gt;custom subject templates&lt;/a&gt; which, while undoubtedly useful, seem especially cursed.&lt;/p&gt;&lt;p&gt;For Oblique, validating the token is easy using &lt;a href=&quot;https://github.com/coreos/go-oidc&quot;&gt;go-oidc&lt;/a&gt;. Here are ~20 lines of code to do that:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;// Public keys for verifying signatures and other metadata are queried
// using this issuer URL.
const actionIssuer = &quot;https://token.actions.githubusercontent.com&quot;
provider, err := oidc.NewProvider(ctx, actionIssuer)
if err != nil {
	// ...
}
config := &amp;oidc.Config{
	// MUST match the &quot;audience&quot; used when minting the token.
	ClientID: &quot;oblique.security&quot;,
}
verifier := provider.Verifier(config)

// Verify the signature, issuer, audience, and expiry of the token.
idToken, err := verifier.Verify(ctx, rawIDToken)
if err != nil {
	// ...
}

// For a complete list, see:
// https://docs.github.com/en/actions/reference/security/oidc#custom-claims-provided-by-github
var claims struct {
	// GitHub username that triggered the action.
	Actor string `json:&quot;actor&quot;`
	// The git ref that triggered the run, e.g. &quot;refs/heads/&lt;branch&gt;&quot; for a branch.
	Ref string `json:&quot;ref&quot;`
	// The full name of the repository, of the form &quot;&lt;org&gt;/&lt;repo&gt;&quot;.
	Repository string `json:&quot;repository&quot;`
	// Will be set if using deployment environments.
	Environment string `json:&quot;environment&quot;`
}
if err := idToken.Claims(&amp;claims); err != nil {
	// ...
}
// Use claims to determine if the Action is authorized.&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Based on the branch and repo fields, we can then make decisions to allow or deny access to our API without any copying and pasting of API keys.&lt;/p&gt;&lt;h2&gt;Workload identity&lt;/h2&gt;&lt;p&gt;It’s rare that a good security outcome provides a significantly better experience for users. Workload identity is absolutely one of those exceptions. You want users to be able to say “please let this action use Terraform to manage the service” without juggling API keys or figuring out how to store those credentials.&lt;/p&gt;&lt;p&gt;While I can (and will) gripe about some of the specifics of the implementation, we shouldn’t just ask for but expect identity primitives like this from any CI/CD, cloud, or infra product. If a platform lets you run code, it should be able to authenticate itself to an external system without pre-configuring an API key.&lt;/p&gt;</content:encoded><author>Eric Chiang</author></item><item><title>Good justifications write themselves</title><link>https://oblique.security/blog/justification-fields/</link><guid isPermaLink="true">https://oblique.security/blog/justification-fields/</guid><description>Organizations ask users to fill out justification fields when requesting access, but these are useless explanations. You should already have the context.</description><pubDate>Fri, 25 Jul 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Your authorization system needs context about why users have access to resources. You might think this is just for compliance — user access reviews, audit trails, the usual checkbox exercises. But context matters more than that.&lt;/p&gt;&lt;p&gt;Access should be &lt;em&gt;explainable&lt;/em&gt;. When you understand why someone has access, you can make confident decisions about managing it. It’s hard to know whether it’s safe to remove access if you don’t even know why it’s there to begin with. 
When you’re reviewing audit logs, meaningful context helps you quickly understand what changed and why.&lt;/p&gt;&lt;p&gt;The typical implementation is a “justification” field on access requests. A user wants Salesforce access via Slack, they get a text box. What do they actually write?&lt;/p&gt;&lt;blockquote&gt;Please give me access to Salesforce&lt;/blockquote&gt;&lt;blockquote&gt;working on project Alpha&lt;/blockquote&gt;&lt;blockquote&gt;I’m a new SDR&lt;/blockquote&gt;&lt;blockquote&gt;asdf&lt;/blockquote&gt;&lt;p&gt;This isn’t the context you’re looking for. This is users trying to get past your form validation so that they can get back to work. And this information decays: we canned Project Alpha two years ago (RIP) and Leo has been promoted to Account Executive. (Congrats, Leo!)&lt;/p&gt;&lt;p&gt;Some organizations try structured justifications with regex patterns:&lt;/p&gt;&lt;blockquote&gt;bug #12345&lt;/blockquote&gt;&lt;blockquote&gt;Acme Inc. support ticket #ABC123&lt;/blockquote&gt;&lt;p&gt;This works well when access is tied to specific work items. If the only reason an individual needs access is due to a specific bug or support ticket, this works great! But who says your ticket has sufficient context and isn’t just yet-another-text-box? Most access requests aren’t about one-off tasks — they’re for routine work that happens every day without a corresponding ticket number.&lt;/p&gt;&lt;p&gt;Most justification should be inferred. If someone works in Sales, of course they need Salesforce access. If they’re on the infrastructure team, they probably need AWS access. You already have this context — job titles, team membership, project assignments. Use it.&lt;/p&gt;&lt;p&gt;You’ll still need that freeform text field as an escape hatch for truly exceptional cases. 
But recognize it for what it is — a last resort that will generate lower-quality context than the inferred or structured justifications you can implement for most scenarios.&lt;/p&gt;&lt;p&gt;Good context doesn’t have to come from a literal justification field. When you’re doing access reviews or investigating changes, you should already have enough information to understand why access exists. If you don’t, fixing your role definitions and team mappings will serve you better than asking users to fill in more text boxes.&lt;/p&gt;&lt;p&gt;Context is essential for explainable access, but you have multiple ways to get it. Infer it from existing organizational data whenever possible. Use structured justifications for specific work items. Reserve freeform explanations for genuine exceptions. Design your permissions around these different sources of context, and you’ll spend less time chasing users for explanations that don’t actually explain anything.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Chesterton&apos;s fence doesn&apos;t apply to access controls</title><link>https://oblique.security/blog/chestertons-fence/</link><guid isPermaLink="true">https://oblique.security/blog/chestertons-fence/</guid><description>IT teams are scared to remove access they don&apos;t understand, leading to sprawling entitlements. Removing unused access isn&apos;t risky — never removing access is.</description><pubDate>Fri, 27 Jun 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;IT teams reluctantly apply this principle to access controls. They’re terrified of removing access because they don’t understand why it was granted in the first place. What if removing it breaks something? This fear consistently trumps the principle of least privilege, leaving organizations with sprawling access that nobody understands. 
Why not &lt;a href=&quot;https://bsky.app/profile/kjhealy.co/post/3lsj3vj7isk2a&quot;&gt;apply Occam’s razor instead?&lt;/a&gt; The simplest explanation for why someone has admin access to everything is that they must need it, right?&lt;/p&gt;&lt;p&gt;But access controls work differently than fences.&lt;/p&gt;&lt;h2&gt;Use the data you already have&lt;/h2&gt;&lt;p&gt;The fear is real: remove the wrong access and someone can’t do their job (or the cows escape). But just because you don’t understand why access was granted in the first place (the core issue in Chesterton’s fence) doesn’t mean you can’t validate that it’s safe to remove.&lt;/p&gt;&lt;p&gt;Unlike rebuilding a fence, regranting access &lt;em&gt;should&lt;/em&gt; be straightforward — except that in most organizations, granting access can take days of approvals that everyone dreads. If we can make getting back that access really &lt;em&gt;really&lt;/em&gt; easy, then is that such a big deal? The fear diminishes when the recovery process is painless.&lt;/p&gt;&lt;p&gt;And unlike mysterious fences, access controls generally have logs. Use them. You might not know why someone got access to customer data many years ago — and the person who granted that access or built that fence might be long gone — but you can see the last time that access was used. Preserving unused access isn’t security, it’s digital hoarding.&lt;/p&gt;&lt;p&gt;Removing access that hasn’t been used in the last 90 days likely has low risk of actual business impact. This is where most access cleanup should start. Don’t try to understand the original intent behind every permission: look at what’s actually being used and start there.&lt;/p&gt;&lt;h2&gt;Context still matters&lt;/h2&gt;&lt;p&gt;The core insight of Chesterton’s fence — that &lt;em&gt;context matters&lt;/em&gt; — absolutely applies to access controls. Good access management isn’t about preserving historical decisions — it’s about making informed decisions based on current data. 
You can’t make your access controls more manageable, and move towards the principle of least privilege, without context.&lt;/p&gt;&lt;p&gt;The best context to have is why access was originally granted. Without capturing that context, every change feels risky. But we often have the next best thing: current usage patterns help us understand the actual risk of removal. Sometimes the fence really was just someone’s mistake, sitting there for years because everyone was too afraid to remove it. (Why &lt;em&gt;do&lt;/em&gt; I have access to Salesforce? We’ll never know…)&lt;/p&gt;&lt;p&gt;Stop preserving access out of fear. Start with what’s unused, make recovery fast, and gradually build from there. The biggest risk isn’t removing the wrong access — it’s never removing any access at all.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item><item><title>Identity management is harder than it should be</title><link>https://oblique.security/blog/identity-management-is-hard/</link><guid isPermaLink="true">https://oblique.security/blog/identity-management-is-hard/</guid><description>Identity management is surprisingly hard: access controls change constantly and require context. We founded Oblique to work on impactful security problems.</description><pubDate>Mon, 23 Jun 2025 15:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Our thinking on identity hasn’t evolved in a long time. The &lt;a href=&quot;https://csrc.nist.gov/projects/role-based-access-control&quot;&gt;predominant access control model&lt;/a&gt; of &lt;a href=&quot;https://csrc.nist.gov/pubs/conference/1992/10/13/rolebased-access-controls/final&quot;&gt;Role-Based Access Controls (RBAC)&lt;/a&gt; was developed in the 1990s, while the more modern alternative of &lt;a href=&quot;https://csrc.nist.gov/pubs/sp/800/162/upd2/final&quot;&gt;Attribute-Based Access Control (ABAC)&lt;/a&gt; is already a decade old. 
We’re having the same conversations now we’ve been having for years: Should access be granted on demand? How should you manage service account identities? What happens when someone’s manager changes? Most organizations still struggle to implement sane access controls.&lt;/p&gt;&lt;p&gt;The problem isn’t missing frameworks or standards. We have plenty of those. The problem is that identity management in practice looks nothing like identity management in theory.&lt;/p&gt;&lt;h2&gt;Identity &lt;em&gt;should&lt;/em&gt; be simple&lt;/h2&gt;&lt;p&gt;Most organizations aim for a reasonable approach that should, in theory, ensure only authenticated and authorized users can access resources: authenticate using SSO through an identity provider, ideally with hardware-backed MFA, and sync users and groups with SCIM provisioning. When fully implemented, these controls work well. The challenge is that any gaps in coverage — whether it’s legacy applications that don’t support SSO or SCIM, incomplete SAML logouts, or weak MFA — create openings that attackers consistently exploit. Groups like LAPSUS$ have taken advantage of these gaps &lt;a href=&quot;https://www.okta.com/blog/2022/03/updated-okta-statement-on-lapsus/&quot;&gt;again&lt;/a&gt; and &lt;a href=&quot;https://www.uber.com/newsroom/security-update/&quot;&gt;again&lt;/a&gt;. Which is to say, these controls don’t work in reality.&lt;/p&gt;&lt;p&gt;We don’t live in a world where every application supports these standards. Critical apps like Stripe only support &lt;a href=&quot;https://support.stripe.com/questions/does-the-dashboard-support-login-via-sso-(single-sign-on)-or-saml&quot;&gt;SSO in beta&lt;/a&gt; and &lt;a href=&quot;https://docs.stripe.com/get-started/account/sso#limitations&quot;&gt;don’t support SCIM at all&lt;/a&gt;. Snowflake &lt;a href=&quot;https://www.snowflake.com/en/blog/multi-factor-identification-default/&quot;&gt;didn’t require MFA until recently&lt;/a&gt;. 
Forget social media accounts — getting to 100% coverage is nearly impossible when even security-conscious SaaS providers aren’t there yet.&lt;/p&gt;&lt;p&gt;SaaS has grown faster than we can secure it and very much &lt;a href=&quot;https://www.jpmorgan.com/technology/technology-blog/open-letter-to-our-suppliers&quot;&gt;became the default&lt;/a&gt;. Your network and endpoints still matter, but identity has become the primary security perimeter — the only control point that touches every application your employees use.&lt;/p&gt;&lt;h2&gt;Identity should be &lt;em&gt;simpler&lt;/em&gt;&lt;/h2&gt;&lt;p&gt;You can’t simply apply SSO, SCIM, and MFA everywhere because not every vendor supports these standards equally, or at all. We live in a world of bandaids — that’s why we have password managers.&lt;/p&gt;&lt;p&gt;Fragmentation makes coverage gaps worse: we end up with multiple ways to solve the same problem, none of them particularly well established. Should authorization be enforced through SSO or SCIM provisioning? There’s no central place to manage authentication and authorization, and since they’re tightly coupled, you can’t treat them as separate problems — yet every vendor implements them differently. This forces every organization to build more identity and security tools, such as proxies to access GitHub or jump hosts to authenticate users via SSH, because the underlying identity systems can’t work consistently across all applications. You need tools to discover which services employees signed up for, and more tools to audit which permissions they have in those services.&lt;/p&gt;&lt;p&gt;We’re not trying to discourage you from implementing reasonable identity controls: you can, and should, get critical production apps behind SSO and MFA, using a proxy if needed to meet those requirements. This is still the right goal — it’s valuable, necessary work that belongs on your IAM roadmap. 
But you also can’t escape the treadmill of slowly forcing every new SaaS application you buy to meet the same requirements. Even organizations with internal requirements to only purchase SaaS apps at the tier that includes SSO (damn that &lt;a href=&quot;https://sso.tax/&quot;&gt;SSO tax&lt;/a&gt;!) can’t get full compliance.&lt;/p&gt;&lt;h2&gt;Access controls are constantly changing, because organizations are constantly changing&lt;/h2&gt;&lt;p&gt;Identity is hard to get “right” because it’s a moving target that never stops moving.&lt;/p&gt;&lt;p&gt;Not only do access controls need to change often, they need to change quickly. Sure, you should remove unused access (but who does?) and revoke access when employees leave, but it’s much more important to ensure employees have access when they need it. No one wants multiple back-and-forths with IT to get the access they need to do their job. This wouldn’t be so difficult except that this change is &lt;em&gt;constant&lt;/em&gt;: there’s always another ticket, another role change, another exception.&lt;/p&gt;&lt;p&gt;Identity is gardening, not engineering. There’s constant weeding and replanting. It’s highly manual work that doesn’t scale the way other security domains do.&lt;/p&gt;&lt;p&gt;Making it scale requires acknowledging what makes it unique. You can’t write generic policies that work for every organization because every organization is unique. Not just different tech stacks, but different organizational structures. A startup’s flat hierarchy needs different controls than a regulated enterprise with strict separation of duties. People and culture are what make an organization what it is, and this business context &lt;em&gt;must&lt;/em&gt; influence how we think about security.&lt;/p&gt;&lt;h2&gt;Access decisions require context that security teams don’t have&lt;/h2&gt;&lt;p&gt;Most organizations want to follow the principle of least privilege: that employees should only have access to what they need. 
But how do you determine that? &lt;em&gt;We don’t know what someone should have access to.&lt;/em&gt; IT and security teams lack the business context for these decisions: What is this app? Is it sensitive? What kinds of users should have access to it? This context exists primarily in people’s heads, and if documented, it’s in an unmaintained spreadsheet. Context decays quickly.&lt;/p&gt;&lt;p&gt;When you don’t know what access users need, you make a mess. Users over-request access to unblock themselves. IT copies someone else’s permissions hoping it works. These incremental changes accumulate into inconsistent controls that never get cleaned up. You end up with hundreds of unused groups and permissions that made sense three reorganizations ago — and no one knows what’s actually needed anymore.&lt;/p&gt;&lt;p&gt;When you get provisioning wrong initially, you spend time cleaning it up later. Debugging why someone’s access works or doesn’t is nearly impossible. You fear removing access and breaking the business — especially when granting access again requires another ticket and hours of waiting. The industry has invented whole product categories to address this, like IGA for user access reviews required by compliance and ISPM for identifying IAM misconfigurations. But these are symptoms of the underlying problem: that we don’t have a systematic way to maintain access controls as organizations evolve.&lt;/p&gt;&lt;h2&gt;Building something better&lt;/h2&gt;&lt;p&gt;There’s no shortage of identity pain points. Even basic controls like SSO and MFA are nearly impossible to implement everywhere. Access requirements constantly change, making them difficult to maintain. And identity decisions require business context that security teams rarely have.&lt;/p&gt;&lt;p&gt;These aren’t just theoretical problems. They have real, measurable impact on people’s daily work. IT teams spend countless hours on access request tickets that could be automated. 
Employees get frustrated when they can’t access the tools they need to do their jobs. Security teams burn out from the constant manual work of managing exceptions and reviewing access that may or may not be appropriate.&lt;/p&gt;&lt;p&gt;When we founded &lt;a href=&quot;https://oblique.security/&quot;&gt;Oblique&lt;/a&gt;, we set out to work on real, tactical, impactful security problems that aren’t disappearing anytime soon — and that’s what we’re doing. Identity is a mess right now, but identity was also a mess before, and it’ll probably keep being a mess.&lt;/p&gt;&lt;p&gt;We’re tackling authorization for corporate environments first. It’s not trendy, but we know it’s impactful, because it’s &lt;a href=&quot;https://mayakaczorowski.com/blogs/what-sucks-in-security&quot;&gt;what we heard security leaders complain about most&lt;/a&gt;. Your IT and security teams want their access controls to be more maintainable. We’re here to help you get there.&lt;/p&gt;</content:encoded><author>Maya Kaczorowski</author></item></channel></rss>