WebLogic: Identity Federation and Certificates


A recent system upgrade turned my whole day into a hide-and-seek playground. After the WebLogic domain migration, everything worked and looked healthy except the primary application. Accessing it returned a 5xx server error for no apparent reason.

The most reasonable explanation was that something was wrong with the WebLogic domain Federation Services. (Here is a great walkthrough if you need Single Sign-On.) Yet all SAML 2.0 components were configured, and the system services were available. Let's walk through the generic root-cause analysis steps together.

  1. Ensure that your system receives client requests. The frontend web server logs indicated successful request processing, and the server replied when the requested resource was not protected by Web SSO (SAMLv2). That means the web server is fine.
  2. Identity Provider engineers confirmed that no requests had been registered on the provider side since the old system was shut down.
  3. If your application server logs are clean but your application does not work, you are not looking deep enough. Combined with the observations from steps 1 and 2, this meant I needed better visibility into the SAML2 security subsystem. The WebLogic debug subsystem is one way to get it without a system restart. Select a managed server, then the Debug tab. In the submodule tree, expand weblogic -> security, mark the atn, atz, and saml2 modules, and click the "Enable" button. Activate the changes and watch the server logs for additional details.
Sample subsystem debug list.
  4. With the new logging level, I caught an exception: "com.bea.security.saml2.service.SAML2Exception Sign authn request error." At that moment, I knew that:
    1. WebLogic receives requests and serves them when the SAMLv2 subsystem is not involved.
    2. It fails while signing the authentication request. That is not much, but it points to the certificate used for signing and encryption.
    3. The certificates and keystores were the only new part of the configuration equation.
  5. A close examination of the old and new certificates gave me another piece of the puzzle: the new certificates use Elliptic Curve Cryptography (ECC) keys, while the old ones use good old RSA keys. A quick internet search brought me another clue from the Microsoft knowledge base. Although the Q&A exchange was old, the WebLogic implementation is not much younger.
  6. Since you must use a single keystore per managed server, the only obvious solution is to import the RSA key and certificate into the same keystore under a new alias and point the SAMLv2 configuration to that alias.
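
One way to compare the old and new certificates from the command line is to export each one and look at its public key algorithm. The sketch below is an illustration, not the exact procedure used here: it assumes openssl is installed alongside keytool, and the function names key_type and keystore_key_type are made up for this example.

```shell
# Classify a key from `openssl x509 -noout -text` output read on stdin.
# openssl reports RSA keys as "rsaEncryption" and EC keys as "id-ecPublicKey".
key_type() {
  case "$(grep 'Public Key Algorithm')" in
    *rsaEncryption*)  echo "RSA" ;;
    *id-ecPublicKey*) echo "EC" ;;
    *)                echo "unknown" ;;
  esac
}

# Hypothetical helper: print the key type of one keystore entry.
# Arguments: <keystore path> <alias>. keytool prompts for the keystore password.
keystore_key_type() {
  "$JAVA_HOME/bin/keytool" -exportcert -rfc -keystore "$1" -alias "$2" |
    openssl x509 -noout -text |
    key_type
}
```

Running keystore_key_type against the old and the new keystore with the relevant aliases should make an RSA versus EC mismatch visible without opening any GUI tool.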

Now we have a reasonable explanation and a mitigation plan. Let's test the hypothesis:

  1. Transfer the old keystore (e.g., original.jks) with the RSA key to the server.
  2. Back up the current keystore (let's name it primary.jks) to simplify rollback.
  3. Import the original key to the current keystore using Java's keytool utility. Keep keystores and key passwords handy since keytool will need them.
$ $JAVA_HOME/bin/keytool -importkeystore -srckeystore /tmp/original.jks \
    -destkeystore /opt/oracle/app/keystores/primary.jks \
    -srcalias webapp -destalias saml2

Transfer the RSA key into the WebLogic keystore

  4. Update the SAML2 configuration with the new alias and activate the changes.
  5. At this point, your application is available again, and the identity provider receives your authentication requests.
💡 Normally, you should restart the WebLogic server to activate keystore and trust store changes, but the SAML subsystem reads the keystore directly and uses the new key right away.
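
The backup step and a check that the import actually landed can be sketched as two small shell functions. This is a minimal sketch under stated assumptions: the names backup_keystore and alias_present are hypothetical, the paths and the saml2 alias are the ones from the keytool command above, and keytool prompts for the keystore password.

```shell
# Copy the keystore next to the original with a timestamp suffix
# and print the backup path so it can be used for a rollback.
backup_keystore() {
  backup="$1.$(date +%Y%m%d%H%M%S).bak"
  cp -p "$1" "$backup" && echo "$backup"
}

# Succeed only if the alias exists in the keystore.
# Arguments: <keystore path> <alias>.
alias_present() {
  "$JAVA_HOME/bin/keytool" -list -keystore "$1" -alias "$2" >/dev/null
}
```

Calling backup_keystore on primary.jks before the import and alias_present with the saml2 alias afterwards gives a quick go/no-go signal; if the check fails, copying the backup over primary.jks restores the previous state.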

A few takeaways:

  • Make sure that your system is the source of the trouble.
  • WebLogic 12c and Azure AD do not support ECC certificates for signing and encryption. I didn't test Entra ID or WebLogic 14c, but I would not be surprised if they don't support them either.
  • When you migrate WebLogic with the Federation Services enabled, keep your previous certificates to maintain configuration compatibility - use a separate alias with a meaningful name.
  • Don't forget to restore the logging level for the WebLogic server (disable the debug modules you enabled) to avoid log bloat.