Guest Min Huang, posted May 27, 2024

When developers read API reference documentation, they sometimes need or want to review the corresponding source code. Until recently, the .NET API reference docs did not link back to the source code, prompting calls from the community for this addition. In response to this feedback, we are happy to announce that links connecting the docs to the source code are now available for most of our popular .NET APIs. In this blog post, we share details about how we added the links to the docs experience and how we used existing APIs to deliver this improvement.

[HEADING=1]Live examples of the links[/HEADING]
Before going into implementation details, we would like to showcase where the docs have changed. For .NET APIs that meet our criteria (Source Link enabled, an accessible PDB, and hosting in a public repository), the links are included in the [iCODE]Definition metadata[/iCODE]. The following image from the [iCODE]String[/iCODE] class demonstrates the placement of this new link:

In cases where overloads are present, the links are included below the overload title. The following image of the [iCODE]String.IndexOf[/iCODE] method demonstrates this pattern:

[HEADING=1]How do we build the links?[/HEADING]
The .NET reference docs pipeline operates on a set of DLL files and NuGet packages. These are processed by a variety of tools that transform their contents into the HTML pages displayed on Microsoft Learn. Correctly building the links to source requires an understanding of the relationship between source, binaries, and GitHub, and of how to tie them together with some existing .NET APIs. In discussing our goal of surfacing links to source with developers from the .NET and Roslyn teams, it became clear that our requirement was closely aligned with Visual Studio’s Go to definition functionality.
With this understanding, and the extensive details of [iCODE]Go to definition[/iCODE] provided by @davidwengier in Go To Definition improvements for external source in Roslyn, we were able to apply a similar approach to build links to source for the docs.

[HEADING=2]Source Link[/HEADING]
Source Link is a technology that enables .NET developers to debug the source code of assemblies referenced by their applications. Though originally intended for source debugging, Source Link adapts well to our scenario. Every .NET project that has Source Link enabled generates a mapping from a relative folder path to an absolute repository URL and stores it in the PDB (Program Database), as described in the Go To Definition improvements for external source in Roslyn blog post by @davidwengier.

To view the [iCODE]Source Link[/iCODE] entry, you can open the DLL using dotPeek or ILSpy. The following screenshot shows an example of accessing the [iCODE]Source Link[/iCODE] entry of [iCODE]System.Private.CoreLib[/iCODE] with dotPeek by navigating to [iCODE]Portable PDB Metadata[/iCODE] and then the [iCODE]CustomDebugInformation[/iCODE] table:

Note: For the metadata definition of Source Link, see PortablePdb-Metadata.

[HEADING=2]Building the links[/HEADING]
Now that we know the overall mapping is stored in the Source Link entry, the next question is how to build a unique link for each type/member in the DLL. For example, the link we built for the [iCODE]String.Clone[/iCODE] method is:

https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/String.cs#L388C13-L388C25

This link can be split into 3 parts:
The first part, [iCODE]https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14[/iCODE], is parsed from the Source Link mapping JSON and is bound to a specific repository commit.
The second part, [iCODE]src/libraries/System.Private.CoreLib/src/System/String.cs[/iCODE], can be found in the [iCODE]Document[/iCODE] table of the PDB.
The last part, [iCODE]#L388C13-L388C25[/iCODE], is built from the [iCODE]SequencePoints[/iCODE] column of the [iCODE]MethodDebugInformation[/iCODE] table. The [iCODE]SequencePoints[/iCODE] blob maps a range of IL instructions in the method body back to the line numbers of the original source code, as demonstrated in the screenshot below. For more details, see the SequencePoints Metadata definition.

We use the System.Reflection.Metadata library to iterate over all the types/members in the DLL and then match the records in the [iCODE]MethodDebugInformation[/iCODE] table to build the final links.

[CODE]
// peReader and debugReader are assumed to be created earlier (not shown in the original post).
var mdReader = peReader.GetMetadataReader();
foreach (var typeDefHandle in mdReader.TypeDefinitions)
{
    // Build the full type name from its namespace and name.
    var typeDef = mdReader.GetTypeDefinition(typeDefHandle);
    string typeName = mdReader.GetString(typeDef.Name);
    string ns = mdReader.GetString(typeDef.Namespace);
    string fullName = String.IsNullOrEmpty(ns) ? typeName : $"{ns}.{typeName}";
    Console.WriteLine(fullName);

    // Print the source link of every document that defines this type.
    foreach (var document in debugReader.FindSourceDocuments(typeDefHandle))
    {
        Console.WriteLine($" {document.SourceLinkUrl}");
    }
}
[/CODE]

The implementation can also be found in Roslyn's DocumentDebugInfoReader.cs and SymbolSourceDocumentFinder.cs.

[HEADING=2]Finding the PDB file[/HEADING]
Since we know the link information is available in the PDB, our next step is to locate these PDBs. Currently, given a DLL, we look in 3 places for the corresponding PDB:
Embedded PDB. If [iCODE]<DebugType>embedded</DebugType>[/iCODE] is specified in your csproj, the PDB file is embedded in the DLL.
PDB on the disk. You can put your PDB right next to your DLL.
Microsoft Symbol Server. There is a public symbol server from which we can download the PDB for the DLL.
See the implementation in Roslyn's PdbFileLocatorService.cs.

[HEADING=2]Finding the correct PDB version[/HEADING]
We would like to talk a little more about how we download the correct version of the PDB for a given DLL from the Microsoft Symbol Server.
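Before turning to the symbol server details, the three-step lookup order from the previous section can be sketched roughly as follows. This is only a minimal illustration, not the pipeline's or Roslyn's actual code: the [iCODE]PdbSource[/iCODE] enum and the [iCODE]PdbLookupSketch[/iCODE] class are hypothetical names, and the sketch assumes that an on-disk PDB shares the DLL's base name.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection.PortableExecutable;

// Hypothetical names for illustration; they do not come from the post or from Roslyn.
public enum PdbSource { Embedded, OnDisk, SymbolServer }

public static class PdbLookupSketch
{
    // Reports where the PDB for the given DLL would be taken from,
    // checking the three locations in the order listed in the post.
    public static PdbSource Locate(string dllPath)
    {
        using var stream = File.OpenRead(dllPath);
        using var peReader = new PEReader(stream);

        // 1. Embedded PDB: it appears as a dedicated debug directory entry in the DLL.
        bool hasEmbeddedPdb = peReader.ReadDebugDirectory()
            .Any(entry => entry.Type == DebugDirectoryEntryType.EmbeddedPortablePdb);
        if (hasEmbeddedPdb)
        {
            return PdbSource.Embedded;
        }

        // 2. PDB on disk: assumed here to sit next to the DLL with the same base name.
        if (File.Exists(Path.ChangeExtension(dllPath, ".pdb")))
        {
            return PdbSource.OnDisk;
        }

        // 3. Otherwise the PDB has to be downloaded from the symbol server.
        return PdbSource.SymbolServer;
    }
}
```

Calling [iCODE]PdbLookupSketch.Locate[/iCODE] with the path of a DLL only reports which location applies; actually opening or downloading the PDB is left out to keep the sketch short.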
Below is a sample PDB download URL; its format is defined in portable-pdb-signature.

http://msdl.microsoft.com/download/symbols/System.Private.CoreLib.pdb/8402667829752b9d0b00ebbc1d5a66d9FFFFFFFF/System.Private.CoreLib.pdb

From the URL pattern we can see that we need to provide the PDB file name, [iCODE]System.Private.CoreLib.pdb[/iCODE], and an ID, [iCODE]8402667829752b9d0b00ebbc1d5a66d9FFFFFFFF[/iCODE], made of a GUID followed by [iCODE]FFFFFFFF[/iCODE]. So where can we find this information? Previously we used dotPeek to open a DLL and look for the [iCODE]Source Link[/iCODE] entry. Now we can open it again and check the [iCODE]Metadata[/iCODE] section. In the above screenshot, we can find this GUID in the [iCODE]Debug Directory[/iCODE]; the entry must be a portable CodeView entry. The [iCODE]Path[/iCODE] attribute of this entry holds the path to the PDB file, from which we can get the file name.

[CODE]
foreach (var entry in peReader.ReadDebugDirectory())
{
    // Only a portable CodeView entry carries the PDB GUID and path we need.
    if (entry.Type == DebugDirectoryEntryType.CodeView && entry.IsPortableCodeView)
    {
        var codeViewEntry = peReader.ReadCodeViewDebugDirectoryData(entry);
        var pdbName = Path.GetFileName(codeViewEntry.Path);
        // The ID is the GUID without dashes, followed by FFFFFFFF.
        var codeViewEntryGuid = $"{codeViewEntry.Guid.ToString("N").ToUpper()}FFFFFFFF";
        return $"{MsftSymbolServerUrl}/{pdbName}/{codeViewEntryGuid}/{pdbName}";
    }
}
[/CODE]

[HEADING=2]Finding the DLL file[/HEADING]
As mentioned earlier, our .NET reference docs pipeline operates on a collection of DLL files or NuGet packages. For some assemblies, though, we needed to get creative to produce the links to source. Here are two situations we needed to develop workarounds for:
Reference Assembly. For example, the DLLs in the Microsoft.NETCore.App.Ref package. Reference assemblies don’t have PDBs uploaded to the symbol server, which prevents us from generating the links to source. Our current solution is to download the Runtime package and use the assemblies there to download the matching PDBs.
Source embedded in PDB.
For example, the System.Threading.AccessControl package has source that is generated at build time into the [iCODE]obj[/iCODE] folder. This doesn’t help us link to the source code, so rather than relying only on the DLL in the [iCODE]lib[/iCODE] folder, we also look for a DLL with the same name in the [iCODE]runtimes[/iCODE] folder.

[HEADING=2]Consuming the links in the docs pipeline[/HEADING]
Once we find the correct DLL/PDB files and successfully build the links to source, we save this information as a JSON file in the target docs GitHub repo. To understand how we use this information, we need to revisit the .NET reference docs pipeline. The pipeline creates an XML file for each unique type, which our build system later converts into an HTML page that is presented on Microsoft Learn. To map an API in the XML to its corresponding links to source in the JSON file, we use the unique identifier [iCODE]DocId[/iCODE]. This value is present in both the XML ([iCODE]DocId[/iCODE]) and the JSON ([iCODE]DocsId[/iCODE]). For example, the [iCODE]DocId[/iCODE] for [iCODE]System.String[/iCODE] is [iCODE]T:System.String[/iCODE]. This [iCODE]DocId[/iCODE] value is used to locate the link to source within the System.Private.CoreLib.json file (for its corresponding version).

[CODE]
"DocsId": "T:System.String",
"SourceLink": "https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/String.cs"
[/CODE]

To learn how to generate a [iCODE]DocId[/iCODE], see DocCommentId.cs or DocumentationCommentId.cs.

[HEADING=2]Known limitations[/HEADING]
In our current implementation we are aware of a few limitations:
For types with no document info recorded in the PDB, such as enums or interfaces, a new GUID, TypeDefinitionDocuments, was introduced in the [iCODE]CustomDebugInformation[/iCODE] table to solve this problem. However, this information is sometimes trimmed for some DLLs, which leaves us unable to produce the links.
See the bug details in dotnet/runtime Issue #100051, No TypeDefinitionDocuments entries compiled for enums/interfaces etc in pdb.
For class members that are defined without a body (e.g. extern or abstract), there is no line information (SequencePoints) included in the PDB. Because of this, we cannot link to a specific span and instead link to the entire file. A future improvement is planned to address this.

[HEADING=2]Another idea for improvement[/HEADING]
As you may have noticed, we share a lot of core logic with [iCODE]Go to definition[/iCODE]. In fact, we reused a couple of its classes in our implementation. One feature we have proposed to improve the process is to modify Roslyn, using its existing code, to generate a source mapping at the type/member level for us to consume. If the community shares the same requirement, please comment to vote for it. Thanks!

[HEADING=2]Give us your feedback[/HEADING]
We would love to get your feedback on using the links, so please let us know what you think! And if you find any issue related to the links, don’t hesitate to share it using the feedback controls or to open a GitHub issue on the related docs repo.

[HEADING=2]Lastly, acknowledgments[/HEADING]
I want to thank my colleague @shiminxu for his contribution to this project. Thanks also to @ericstj from the .NET team and @tmat from the Roslyn team for their technical guidance. And finally, thanks to the countless others who contributed to making this change possible.

The post Introducing links to source code for .NET API Docs appeared first on .NET Blog.