How to Build a PDF Document Streaming Server

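This article describes how to use the range request support in PDF.js to serve an e-book as a stream, so that others cannot casually download it. It covers handling Range requests in a Node.js server, loading PDF pages on demand, and displaying them in React with the React PDF viewer component. It also notes that Golang's http.ServeFile supports range requests, and shows how a PDF can be loaded in C#.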


Preface

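I made an e-book myself and did not want others to download it freely, so I looked into serving it as a stream that downloads while it is being read. I went through the solutions available online and collected them here.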

PDF.js

PDF.js supports some features to deal with long documents with many pages. For example, it supports document streaming via byte range requests. This enables documents previously optimized for fast web view to display for the user almost instantly when served via a URL – without having to wait on the entire file to download first.

PDFTron’s

You can load the PDF document stream in PDF Viewer from server-side using the Load() API in the WebAPI controller. Refer to the following code.

With PDFJS.disableAutoFetch = true, PDF.js fetches only a few pages at a time and loads the rest on demand.

pdfjsLib.getDocument({
  url: 'http://www.xxx.yy/zz.pdf',
  disableAutoFetch: true,
  disableStream: true
}).promise.then(function (pdf) { /* use pdf */ });
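To make "on demand" more concrete, here is a minimal sketch of how a viewer could map the bytes it needs onto fixed-size chunks and the Range header for each one. The helper names are hypothetical and this is not PDF.js code; the chunk size of 65536 bytes is an assumption (as far as I know it matches the PDF.js default for rangeChunkSize), and the file size is made up.

```javascript
// Hypothetical helper: the Range header value for chunk `index` of a file.
function rangeHeaderForChunk(index, fileSize, chunkSize = 65536) {
  const start = index * chunkSize;
  if (start >= fileSize) {
    throw new RangeError(`chunk ${index} is past the end of the file`);
  }
  // The last chunk is usually shorter than chunkSize.
  const end = Math.min(start + chunkSize, fileSize) - 1;
  return `bytes=${start}-${end}`;
}

// Hypothetical helper: which chunks cover the byte span [begin, end)?
function chunksForSpan(begin, end, chunkSize = 65536) {
  const first = Math.floor(begin / chunkSize);
  const last = Math.floor((end - 1) / chunkSize);
  const chunks = [];
  for (let i = first; i <= last; i++) chunks.push(i);
  return chunks;
}

const fileSize = 200000; // assumed size of the PDF in bytes
console.log(rangeHeaderForChunk(0, fileSize)); // bytes=0-65535
console.log(rangeHeaderForChunk(3, fileSize)); // bytes=196608-199999
console.log(chunksForSpan(60000, 140000));     // [ 0, 1, 2 ]
```

Each header value is what the server sees in req.headers.range, so a page that lives in bytes 60000 to 140000 costs three requests instead of a full download.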

Implementing Range Requests


Using Node.js as the server

const http = require("http");
const port = process.env.PORT || 3000;

const { stat, createReadStream } = require("fs");
const { promisify } = require("util");
const { pipeline } = require("stream");
const samplePDF = "./demo.pdf";
const fileInfo = promisify(stat);

http
  .createServer(async (req, res) => {

    /** Calculate Size of file */
    const { size } = await fileInfo(samplePDF);
    const range = req.headers.range;
    console.log(size)

    /** Check for Range header */
    if (range) {
      /** Extract the start and end values from the Range header */
      let [start, end] = range.replace(/bytes=/, "").split("-");
      start = parseInt(start, 10);
      end = parseInt(end, 10);

      if (!isNaN(start) && isNaN(end)) {
        // "bytes=500-": from start to the end of the file
        end = size - 1;
      }
      if (isNaN(start) && !isNaN(end)) {
        // "bytes=-500": the last 500 bytes of the file
        start = Math.max(size - end, 0);
        end = size - 1;
      }

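      // An end past the last byte is clamped rather than rejected
      if (end >= size) {
        end = size - 1;
      }

      // Handle unsatisfiable range requests
      if (isNaN(start) || isNaN(end) || start >= size || start > end) {
        // Return 416 Range Not Satisfiable.
        res.writeHead(416, {
          "Content-Range": `bytes */${size}`
        });
        return res.end();
      }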

      /** Send partial content with HTTP status 206 */
      res.writeHead(206, {
        "Content-Range": `bytes ${start}-${end}/${size}`,
        "Accept-Ranges": "bytes",
        "Content-Length": end - start + 1,
        "Content-Type": "application/pdf",
        "Access-Control-Allow-Origin": "http://localhost:4200"
      });

      const readable = createReadStream(samplePDF, { start, end });
      pipeline(readable, res, err => {
        if (err) console.error(err);
      });

    } else {

      res.writeHead(200, {
        "Access-Control-Expose-Headers": "Accept-Ranges",
        "Access-Control-Allow-Headers": "Accept-Ranges,range",
        "Accept-Ranges": "bytes",
        "Content-Length": size,
        "Content-Type": "application/pdf",
        "Access-Control-Allow-Origin": "http://localhost:4200"
      });

      if (req.method === "GET") {
        const readable = createReadStream(samplePDF);
        pipeline(readable, res, err => {
          if (err) console.error(err);
        });
      } else {
        res.end();
      }

    }
  })
  .listen(port, () => console.log(`Running on port ${port}`));
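The range handling in the server above can also be pulled out into a small pure function, which makes the edge cases easy to check without starting a server. This is only a sketch: parseRange is a hypothetical name, it handles a single range (no comma-separated lists), and it returns null wherever the server would answer 416.

```javascript
// Hypothetical helper: parse a single-range "Range" header for a file of
// `size` bytes. Returns { start, end } (both inclusive) or null if the
// header is malformed or the range cannot be satisfied.
function parseRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const [, rawStart, rawEnd] = match;
  if (rawStart === "" && rawEnd === "") return null;

  let start;
  let end;
  if (rawStart === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = parseInt(rawEnd, 10);
    if (suffix === 0) return null;
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = parseInt(rawStart, 10);
    // "bytes=N-" runs to the last byte; an end past EOF is clamped.
    end = rawEnd === "" ? size - 1 : Math.min(parseInt(rawEnd, 10), size - 1);
  }
  if (start >= size || start > end) return null;
  return { start, end };
}

const size = 1000; // assumed file size in bytes
console.log(parseRange("bytes=0-99", size));     // { start: 0, end: 99 }
console.log(parseRange("bytes=900-", size));     // { start: 900, end: 999 }
console.log(parseRange("bytes=-100", size));     // { start: 900, end: 999 }
console.log(parseRange("bytes=500-5000", size)); // { start: 500, end: 999 }
console.log(parseRange("bytes=1000-", size));    // null
```

In the request handler, a null result would map to the 416 response and anything else to createReadStream(samplePDF, { start, end }).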

Using Golang as the server

  • http.ServeFile() supports range requests out of the box
  • If the content is not a file, use http.ServeContent():
func ServeContent(w ResponseWriter, req *Request,
    name string, modtime time.Time, content io.ReadSeeker)
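For comparison with the Node.js server earlier in this article, the "content is not a file" case can be sketched there as well: when the PDF already sits in memory as a Buffer, the partial response is just a slice plus the matching headers. This is a rough Node.js counterpart, not a description of what http.ServeContent does internally; partialResponse is a hypothetical helper and the sample bytes are made up.

```javascript
// Hypothetical helper: build a 206 response for bytes [start, end]
// (both inclusive) of a PDF held in memory as a Buffer.
function partialResponse(content, start, end) {
  // subarray() shares memory with `content`; nothing is copied.
  const body = content.subarray(start, end + 1);
  return {
    status: 206,
    headers: {
      "Content-Range": `bytes ${start}-${end}/${content.length}`,
      "Accept-Ranges": "bytes",
      "Content-Length": body.length,
      "Content-Type": "application/pdf",
    },
    body,
  };
}

// Stand-in for real PDF bytes (assumption for the example).
const content = Buffer.from("0123456789");
const response = partialResponse(content, 2, 5);
console.log(response.headers["Content-Range"]); // bytes 2-5/10
console.log(response.body.toString());          // 2345
```

In an http handler this would be sent with res.writeHead(response.status, response.headers) followed by res.end(response.body).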

Client-side implementation

React PDF viewer

Installation:

npm install pdfjs-dist@3.4.120
brew install pkg-config cairo pango # for Apple M1
npm install @react-pdf-viewer/core@3.12.0

React component:

 <Viewer
         fileUrl={`${DOCUMENT_API_URL}/media/${pdfFile}/`}
         httpHeaders={{
             Authorization: `${AUTHENTICATION_API_AUTHORIZATION_HEADER} ${accessToken}`,
             [AUTHENTICATION_API_HEADER]: AUTHENTICATION_API_TOKEN,
         }}
         transformGetDocumentParams={(options) => 
             Object.assign({}, options, {
                 disableRange: false,
                 disableStream: true,
                 disableAutoFetch: false,
             })
         }
         plugins={[defaultLayoutPluginInstance]}
         theme="dark"
         renderError={renderError}
/>
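The transformGetDocumentParams callback in the component above is a plain options merge, so it can be tried on its own. A minimal sketch follows; the shape of the incoming options object is an assumption for illustration, and the real object passed by the viewer may carry more fields.

```javascript
// Same merge as in the component above, written as a standalone function.
function transformGetDocumentParams(options) {
  return Object.assign({}, options, {
    disableRange: false,    // keep HTTP range requests enabled
    disableStream: true,    // do not stream the whole file
    disableAutoFetch: false,
  });
}

// Assumed input; the URL and header value are placeholders.
const incoming = {
  url: "/media/demo.pdf/",
  httpHeaders: { Authorization: "Bearer <token>" },
};
const merged = transformGetDocumentParams(incoming);

console.log(merged.url);             // /media/demo.pdf/
console.log(merged.disableRange);    // false
console.log(merged.disableStream);   // true
console.log(incoming.disableStream); // undefined (the input is not mutated)
```

Copying into a fresh object keeps whatever the viewer already set (URL, headers) and only overrides the three loading flags.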

C#

FileStream stream = new FileStream(HttpContext.Current.Server.MapPath("~/Data/F# Succinctly.pdf"), FileMode.Open);
helper.Load(stream);
The following Load() APIs are available in the PDF Viewer Web platform:

Load(byte[] byteArray)
Load(Stream stream)
Load(string filePath)
Load(Syncfusion.Pdf.Parsing.PdfLoadedDocument lodedDocument)
Load(byte[] byteArray, string password)
Load(Stream stream, string password)
Load(string filePath, string password)

References

  • streaming-a-pdf-from-the-web
  • Client-side implementation using Angular
  • https://community.safe.com/s/article/using-the-data-streaming-service-to-stream-pdf
  • https://github.com/mozilla/pdf.js/issues/8897
  • https://www.cnblogs.com/zwbsoft/p/13280672.html
  • https://github.com/citeccyr/pdf-stream
  • https://react-pdf-viewer.dev/docs/getting-started/
  • How to handle Partial Content in Node.js
  • Commercial example: https://showcase.apryse.com