{"id":2406789,"date":"2023-12-01T19:58:00","date_gmt":"2023-12-02T00:58:00","guid":{"rendered":"https:\/\/platoaistream.net\/plato-data\/building-a-rag-pipeline-for-semi-structured-data-with-langchain\/"},"modified":"2023-12-01T19:58:00","modified_gmt":"2023-12-02T00:58:00","slug":"building-a-rag-pipeline-for-semi-structured-data-with-langchain","status":"publish","type":"station","link":"https:\/\/platoaistream.net\/plato-data\/building-a-rag-pipeline-for-semi-structured-data-with-langchain\/","title":{"rendered":"Building A RAG Pipeline for Semi-structured Data with Langchain"},"content":{"rendered":"<h2 class=\"wp-block-heading\" id=\"h-introduction\">Introduction<\/h2>\n<p>Retrieval Augmented Generation (RAG) has been around for a while. Many tools and applications, such as vector stores, retrieval frameworks, and LLMs, are being built around this concept, making it convenient to work with custom documents. Working with long, dense texts has never been so easy and fun. Conventional <a rel=\"noopener\" target=\"_blank\" href=\"https:\/\/www.analyticsvidhya.com\/blog\/2023\/09\/retrieval-augmented-generation-rag-in-ai\/\">RAG<\/a> works well with unstructured, text-heavy files such as DOC and PDF documents. However, this approach does not work well with semi-structured data, such as tables embedded in PDFs.<\/p>\n<p>While working with semi-structured data, there are usually two concerns.<\/p>\n<ul>\n<li>The conventional extraction and text-splitting methods do not account for tables in PDFs. They usually end up breaking up the tables. 
This results in information loss.<\/li>\n<li>Embedding raw tables may not yield precise semantic search results.<\/li>\n<\/ul>\n<p>So, in this article, we will build a Retrieval Augmented Generation pipeline for semi-structured data with Langchain that addresses these two concerns.<\/p>\n<h4 class=\"wp-block-heading\" id=\"h-learning-objectives\">Learning Objectives<\/h4>\n<ul>\n<li>Understand the difference between structured, unstructured, and semi-structured data.<\/li>\n<li>Get a quick refresher on Retrieval Augmented Generation and Langchain.<\/li>\n<li>Learn how to build a multi-vector retriever to handle semi-structured data with Langchain.<\/li>\n<\/ul>\n<p><em><strong>This article was published as a part of the&nbsp;<a rel=\"noopener\" target=\"_blank\" href=\"https:\/\/analyticsvidhya.com\/blogathon\">Data Science Blogathon<\/a>.<\/strong><\/em><\/p>\n<h2 class=\"wp-block-heading\" id=\"h-types-of-data\">Types of Data<\/h2>\n<p>There are usually three types of data: structured, semi-structured, and unstructured.<\/p>\n<ul>\n<li><strong>Structured Data<\/strong>: Structured data is standardized data that follows a pre-defined schema, such as rows and columns. Examples include SQL databases, spreadsheets, and data frames.<\/li>\n<li><strong>Unstructured Data<\/strong>: Unstructured data, unlike structured data, follows no data model and has no fixed organization. Examples include PDFs, plain text, and images.<\/li>\n<li><strong>Semi-structured Data<\/strong>: It combines traits of the two former data types. Unlike structured data, it does not have a rigid pre-defined schema. However, the data still maintains a hierarchical order based on some markers, which is in contrast to unstructured types. 
Examples include CSV files, HTML, tables embedded in PDFs, and XML.<\/li>\n<\/ul>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/platoaistream.net\/wp-content\/uploads\/2023\/12\/building-a-rag-pipeline-for-semi-structured-data-with-langchain.webp\" alt=\"Semi-structured Data with Langchain\"><\/figure>\n<h2 class=\"wp-block-heading\" id=\"h-what-is-rag\">What is RAG?<\/h2>\n<p>RAG stands for Retrieval Augmented Generation. It is a straightforward way to supply large language models with information they were not trained on. So, let\u2019s have a quick primer on RAG.<\/p>\n<p>In a typical RAG pipeline, we have knowledge sources (such as local files, web pages, and databases), an embedding model, a vector database, and an LLM. We collect the data from the various sources, split the documents, compute the embeddings of the text chunks, and store them in a vector database. At query time, we embed the query, pass the embedding to the vector store, retrieve the most similar documents, and finally generate an answer with the LLM.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/platoaistream.net\/wp-content\/uploads\/2023\/12\/building-a-rag-pipeline-for-semi-structured-data-with-langchain-1.webp\" alt=\"What is RAG? Semi-structured Data with Langchain\"><\/figure>\n<p>This is the workflow of a conventional RAG pipeline, and it works well with unstructured data like text. However, when it comes to semi-structured data, for example, embedded tables in a PDF, it often fails to perform well. In this article, we will learn how to handle these embedded tables.<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-what-is-langchain\">What is Langchain?<\/h2>\n<p>Langchain is an open-source framework for building LLM-based applications. Since its launch, the project has garnered wide adoption among software developers. It provides a unified range of tools and technologies to build AI applications faster. 
Langchain houses tools such as vector stores, document loaders, retrievers, embedding models, and text splitters. It is a one-stop solution for building AI applications. But two core value propositions make it stand apart.<\/p>\n<ul>\n<li><b>LLM chains<\/b>: Langchain provides multiple chains. Each chain links several tools together to accomplish a single task. For example, ConversationalRetrievalChain combines an LLM, a vector store retriever, an embedding model, and a chat history object to generate responses for a query. The tools are hard-coded and have to be defined explicitly.<\/li>\n<li><b>LLM agents<\/b>: Unlike LLM chains, AI agents do not have hard-coded tools. Instead of chaining one tool after another, we let the LLM decide which tool to select, and when, based on text descriptions of the tools. This makes agents well suited to complex LLM applications involving reasoning and decision-making.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\" id=\"h-building-the-rag-pipeline\">Building The RAG pipeline<\/h2>\n<p>Now that we have a primer on the concepts, let\u2019s discuss the approach to building the pipeline. Working with semi-structured data can be tricky as it does not follow a conventional schema for storing information. To work with it, we need specialized tools tailor-made for extracting information. So, in this project, we will use one such tool called \u201cunstructured\u201d; it is an open-source tool for extracting information from different data formats, such as tables in PDFs, HTML, and XML. Unstructured uses Tesseract and Poppler under the hood to process multiple data formats in files. 
So, let\u2019s set up our environment and install dependencies before diving into the coding part.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/platoaistream.net\/wp-content\/uploads\/2023\/12\/building-a-rag-pipeline-for-semi-structured-data-with-langchain-2.webp\" alt=\"Building the RAG Pipeline | Semi-structured Data with Langchain\"><\/figure>\n<h4 class=\"wp-block-heading\" id=\"h-set-up-dev-env\">Set-up Dev Env<\/h4>\n<p>As with any other Python project, open a Python environment (the commands below use notebook syntax) and install Poppler and Tesseract.<\/p>\n<pre class=\"wp-block-code\"><code>!sudo apt install tesseract-ocr\n!sudo apt-get install poppler-utils<\/code><\/pre>\n<p>Now, install the dependencies that we will need in our project, including the Cohere, Chroma, and tiktoken packages that the later steps rely on.<\/p>\n<pre class=\"wp-block-code\"><code>!pip install \"unstructured[all-docs]\" langchain openai cohere chromadb tiktoken<\/code><\/pre>\n<p>Now that we have installed the dependencies, we will extract data from a PDF file.<\/p>\n<pre class=\"wp-block-code\"><code>from unstructured.partition.pdf import partition_pdf\n\n# Partition the PDF into typed elements (text chunks, tables, ...)\npdf_elements = partition_pdf(\n    \"mistral7b.pdf\",\n    chunking_strategy=\"by_title\",\n    extract_images_in_pdf=True,\n    max_characters=3000,\n    new_after_n_chars=2800,\n    combine_text_under_n_chars=2000,\n    image_output_dir_path=\".\/\",\n)<\/code><\/pre>\n<p>The first run downloads the models that unstructured needs for layout detection and OCR, such as YOLOX, and the call returns a list of typed elements based on the extracted data. Enabling&nbsp;extract_images_in_pdf will let unstructured extract embedded images from files. 
This can help implement multi-modal solutions.<\/p>\n<p>Now, let\u2019s explore the categories of elements from our PDF.<\/p>\n<pre class=\"wp-block-code\"><code># Create a dictionary to store counts of each type\ncategory_counts = {}\n\nfor element in pdf_elements:\n    category = str(type(element))\n    if category in category_counts:\n        category_counts[category] += 1\n    else:\n        category_counts[category] = 1\n\n# unique_categories holds the distinct element types\nunique_categories = set(category_counts.keys())\ncategory_counts<\/code><\/pre>\n<p>Running this will output the element categories and their counts.<\/p>\n<p>Now, we separate the elements for easy handling. We create an Element type that inherits from Langchain\u2019s Document type. This keeps the data organized and easier to deal with.<\/p>\n<pre class=\"wp-block-code\"><code>from unstructured.documents.elements import CompositeElement, Table\nfrom langchain.schema import Document\n\n\nclass Element(Document):\n    type: str\n\n\n# Categorize by type\ncategorized_elements = []\nfor element in pdf_elements:\n    if isinstance(element, Table):\n        categorized_elements.append(Element(type=\"table\", page_content=str(element)))\n    elif isinstance(element, CompositeElement):\n        categorized_elements.append(Element(type=\"text\", page_content=str(element)))\n\n# Tables\ntable_elements = [e for e in categorized_elements if e.type == \"table\"]\n\n# Text\ntext_elements = [e for e in categorized_elements if e.type == \"text\"]<\/code><\/pre>\n<h2 class=\"wp-block-heading\" id=\"h-multi-vector-retriever\">Multi-vector Retriever<\/h2>\n<p>We have table and text elements. Now, there are two ways we can handle these. We can embed and store the raw elements directly, or we can embed summaries of them. Raw tables might pose a challenge to semantic search; in that case, we create summaries of the tables, embed the summaries for retrieval, and keep the raw tables in a document store. To achieve this, we will use MultiVectorRetriever. 
This retriever will manage a vector store, where we store the embeddings of the summary texts, and a simple in-memory document store that holds the raw documents.<\/p>\n<p>First, build a summarizing chain to summarize the table and text data we extracted earlier.<\/p>\n<pre class=\"wp-block-code\"><code>from langchain.chat_models import ChatCohere\nfrom langchain.prompts import ChatPromptTemplate\nfrom langchain.schema.output_parser import StrOutputParser\n\nprompt_text = \"\"\"You are an assistant tasked with summarizing tables and text.\nGive a concise summary of the table or text. Table or text chunk: {element}\"\"\"\nprompt = ChatPromptTemplate.from_template(prompt_text)\n\nmodel = ChatCohere(cohere_api_key=\"your_key\")\nsummarize_chain = {\"element\": lambda x: x} | prompt | model | StrOutputParser()\n\n# Summarize tables and texts in batches of up to 5 concurrent requests\ntables = [i.page_content for i in table_elements]\ntable_summaries = summarize_chain.batch(tables, {\"max_concurrency\": 5})\n\ntexts = [i.page_content for i in text_elements]\ntext_summaries = summarize_chain.batch(texts, {\"max_concurrency\": 5})<\/code><\/pre>\n<p>I have used a Cohere chat model for summarizing the data; you may use OpenAI models like GPT-4 instead. Better models will generally yield better summaries. Sometimes, the models may not perfectly capture table details. 
So, it is better to use capable models.<\/p>\n<p>Now, we create the MultiVectorRetriever.<\/p>\n<pre class=\"wp-block-code\"><code>import uuid\n\nfrom langchain.embeddings import OpenAIEmbeddings\nfrom langchain.prompts import ChatPromptTemplate\nfrom langchain.retrievers import MultiVectorRetriever\nfrom langchain.schema.document import Document\nfrom langchain.storage import InMemoryStore\nfrom langchain.vectorstores import Chroma\n\n# The vectorstore to use to index the summaries\nvectorstore = Chroma(collection_name=\"collection\", embedding_function=OpenAIEmbeddings(openai_api_key=\"api_key\"))\n\n# The storage layer for the raw (parent) documents\nstore = InMemoryStore()\nid_key = \"id\"\n\n# The retriever\nretriever = MultiVectorRetriever(\n    vectorstore=vectorstore,\n    docstore=store,\n    id_key=id_key,\n)\n\n# Add texts: summaries go to the vector store, raw texts to the docstore\ndoc_ids = [str(uuid.uuid4()) for _ in texts]\nsummary_texts = [\n    Document(page_content=s, metadata={id_key: doc_ids[i]})\n    for i, s in enumerate(text_summaries)\n]\nretriever.vectorstore.add_documents(summary_texts)\nretriever.docstore.mset(list(zip(doc_ids, texts)))\n\n# Add tables: summaries go to the vector store, raw tables to the docstore\ntable_ids = [str(uuid.uuid4()) for _ in tables]\nsummary_tables = [\n    Document(page_content=s, metadata={id_key: table_ids[i]})\n    for i, s in enumerate(table_summaries)\n]\nretriever.vectorstore.add_documents(summary_tables)\nretriever.docstore.mset(list(zip(table_ids, tables)))<\/code><\/pre>\n<p>We used a Chroma vector store for the summary embeddings of texts and tables and an in-memory document store for the raw data. The shared id in each summary\u2019s metadata links it back to its raw document.<\/p>\n<h4 class=\"wp-block-heading\" id=\"h-rag\">RAG<\/h4>\n<p>Now that our retriever is ready, we can build a RAG pipeline using Langchain Expression Language.<\/p>\n<pre class=\"wp-block-code\"><code>from langchain.schema.runnable import RunnablePassthrough\n\n# Prompt template\ntemplate = \"\"\"Answer the question based only on the following context, which can include text and tables:\n{context}\nQuestion: {question}\n\"\"\"\nprompt = 
ChatPromptTemplate.from_template(template)\n\n# LLM (ChatOpenAI has to be imported before it is used)\nfrom langchain.chat_models import ChatOpenAI\n\nmodel = ChatOpenAI(temperature=0.0, openai_api_key=\"api_key\")\n\n# RAG pipeline\nchain = (\n    {\"context\": retriever, \"question\": RunnablePassthrough()}\n    | prompt\n    | model\n    | StrOutputParser()\n)\n<\/code><\/pre>\n<p>Now, we can ask questions and receive answers based on the raw texts and tables that the retriever looks up through the summary embeddings in the vector store.<\/p>\n<pre class=\"wp-block-code\"><code>chain.invoke(\"What is the MT bench score of Llama 2 and Mistral 7B Instruct?\")<\/code><\/pre>\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n<p>A lot of information stays hidden in semi-structured data formats, and it is challenging to extract this data and perform conventional RAG on it. In this article, we went from extracting texts and embedded tables from a PDF to building a multi-vector retriever and a RAG pipeline with Langchain. So, here are the key takeaways from the article.<\/p>\n<h4 class=\"wp-block-heading\" id=\"h-key-takeaways\">Key Takeaways<\/h4>\n<ul>\n<li>Conventional RAG often faces challenges dealing with semi-structured data, such as breaking up tables during text splitting and imprecise semantic searches.<\/li>\n<li>Unstructured, an open-source tool for semi-structured data, can extract embedded tables from PDFs or similar semi-structured data.<\/li>\n<li>With Langchain, we can build a multi-vector retriever that stores summary embeddings in a vector store and the raw tables and texts in a document store for better semantic search.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\" id=\"h-frequently-asked-questions\">Frequently Asked Questions<\/h2>\n<div class=\"schema-faq wp-block-yoast-faq-block\">\n<div class=\"schema-faq-section\" id=\"faq-question-1701435127407\"><strong class=\"schema-faq-question\">Q1. 
What is semi-structured data?<\/strong> <\/p>\n<p class=\"schema-faq-answer\">A. Semi-structured data, unlike structured data, does not have a rigid schema but uses markers, such as tags or delimiters, to impose a hierarchy.<\/p>\n<\/p><\/div>\n<div class=\"schema-faq-section\" id=\"faq-question-1701435149965\"><strong class=\"schema-faq-question\">Q2. What are some examples of semi-structured data?<\/strong> <\/p>\n<p class=\"schema-faq-answer\">A. Examples of semi-structured data include CSV files, emails, HTML, XML, and Parquet files.<\/p>\n<\/p><\/div>\n<div class=\"schema-faq-section\" id=\"faq-question-1701435177052\"><strong class=\"schema-faq-question\">Q3. What is Langchain used for?<\/strong> <\/p>\n<p class=\"schema-faq-answer\">A. LangChain is an open-source framework that simplifies the creation of applications using large language models. It can be used for various tasks, including chatbots, RAG, question-answering, and generative tasks.<\/p>\n<\/p><\/div>\n<div class=\"schema-faq-section\" id=\"faq-question-1701435193982\"><strong class=\"schema-faq-question\">Q4. What is a RAG pipeline?<\/strong> <\/p>\n<p class=\"schema-faq-answer\">A. A RAG pipeline ingests documents from external data sources, processes and stores them in a knowledge base, retrieves the ones relevant to a query, and passes them to an LLM to generate an answer.<\/p>\n<\/p><\/div>\n<div class=\"schema-faq-section\" id=\"faq-question-1701435211966\"><strong class=\"schema-faq-question\">Q5. What is the difference between Langchain and LlamaIndex?<\/strong> <\/p>\n<p class=\"schema-faq-answer\">A. 
LlamaIndex is designed specifically for search and retrieval applications, while Langchain offers flexibility for creating custom AI agents.<\/p>\n<\/p><\/div><\/div>\n<p><b>The media shown in this article is not owned by Analytics Vidhya and is used at the Author\u2019s discretion.<\/b><\/p>\n<ul class=\"plato-post-bottom-links\">\n<li class=\"plato-post-bottom-link-platohealth\">PlatoHealth. Biotech and Clinical Trials Intelligence. 
<a rel=\"noopener\" target=\"_blank\" href=\"https:\/\/platohealth.ai\">Access Here.<\/a><\/li>\n<li class=\"plato-post-bottom-link-source\"><span>Source:<\/span> <a rel=\"noopener\" target=\"_blank\" href=\"https:\/\/www.analyticsvidhya.com\/blog\/2023\/12\/building-a-rag-pipeline-for-semi-structured-data-with-langchain\/\">https:\/\/www.analyticsvidhya.com\/blog\/2023\/12\/building-a-rag-pipeline-for-semi-structured-data-with-langchain\/<\/a><\/li>\n<\/ul>\n","protected":false},"author":1,"featured_media":2406790,"template":"Default","meta":{"_eb_attr":"","type":"","auto_type":false,"post":"","stream":"","stream_url":"","waveform_data":[],"duration":0,"bpm":0,"downloadable":false,"download_url":"","purchase_title":"","purchase_url":"","post-count-all":0,"like_count":0,"download_count":0,"editor_note":"","copyright":"","captions":[]},"genre":[42022],"artist":[42023],"mood":[],"activity":[],"station_tag":[21246,38012,33649,38013,34957,21579,10944,51618,51592,54118,4262,13775,47443,4263,12626,51603,9085,3680,39830,4133,4043,51052,3787,51121,4046,43445,18340,9773,12454,24340,51050,12148,3681,9617,6558,48636,3718,4135,48646,57715,54428,5414,48637,9222,40024,30993,48638,51045,12619,9224,14383,13655,14725,4526,4244,40004,51189,48639,24510,48652,9239,11999,11501,14043,40213,10957,10947,40129,4146,9086,11483,16255,6010,11499,51625,4963,10623,8890,17500,9272,51600,9226,10146,40297,36222,54857,40590,34284,10738,40570,10315,11928,4152,11002,11006,3642,5226,6119,4155,4156,48666,5236,47632,55649,56499,9578,12134,5871,3896,13670,9163,11538,9090,59113,51263,34237,4531,51607,12346,48657,14017,40053,40011,48492,48659,11824,52783,20661,12360,51631,51601,51373,39835,4250,9165,12620,452,9166,12390,59647,11100,52844,39878,40638,43582,4593,35493,10369,30957,18806,4490,4067,4069,9874,12646,48640,5175,58331,40112,9570,4070,13103,13776,4005,59137,9007,10844,11204,40035,34012,12472,4076,22864,48653,9238,51599,3732,39882,4313,21664,4567,3694,4185,12342,18595,3650,48660,48663,12756,51458,1335
3,9267,22734,19265,10371,3953,3695,3908,11381,11192,47486,40062,51123,53061,3653,48648,3805,4597,4572,3806,3734,12834,4318,54426,48658,34918,4279,3851,21292,29070,40144,11042,3737,11796,39875,26118,4353,12471,12343,53775,4094,9413,10997,6100,11457,41121,39917,44248,52156,51595,39918,51124,33849,16658,48641,4096,51591,48642,17454,51125,4099,4207,15633,51134,3662,10787,3663,51048,52000,48887,10683,54854,12344,50716,4857,56747,40330,27195,7457,8453,9642,55070,48150,43670,9221,9364,4020,60562,9730,9057,14213,15313,50714,9248,28968,12192,42598,4216,5492,44076,16559,10274,51982,13606,12142,46242,48649,4118,49737,3811,57873,60469,47321,3976,12941,12345,9232,4461,60470,12444,13654,52967,3779,4282,37041,10951,4223,11458,58969,31405,9268,4860,44550,13280,59359,9979,13751,4901,4032,4261,11508,39843,40835,40349,40914,11525,40350,10316,4229,34672,16659,57757,48650,13215,39845,48662,28675,48651,51191,51129,13569,19267,9084,10956,8813,41125,51830,51612,12348,40078,13074,46246,8955,51987,10367,51049,53283,51195,9177,3782,11001,51130,23339,13218,51131,4590,31564,48656,52558,9178,4368,51359,36520,11454,40230,27168,48644,3935,43538,9608,4240,4927,5311,4370,51133,14555],"_links":{"self":[{"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/station\/2406789"}],"collection":[{"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/station"}],"about":[{"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/types\/station"}],"author":[{"embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/users\/1"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/media\/2406790"}],"wp:attachment":[{"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/media?parent=2406789"}],"wp:term":[{"taxonomy":"genre","embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/genre?post=2406789"},{"taxonomy":"artist","embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/artist?post=2406789"},{"taxonomy":"mood","embeddable":tru
e,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/mood?post=2406789"},{"taxonomy":"activity","embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/activity?post=2406789"},{"taxonomy":"station_tag","embeddable":true,"href":"https:\/\/platoaistream.net\/wp-json\/wp\/v2\/station_tag?post=2406789"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}