Problem reading a Word document in ASP.NET (C#)
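Over the past few days I have been trying to display the contents of a Word document in an ASP.NET page. I tried several approaches and none gave a satisfactory result. When I read the file with StreamReader and FileStream, the page shows garbled text no matter which encoding I use.
Someone then suggested reading it through the COM component. I tried that, and the garbled text is indeed gone, but all of the Word formatting is lost: the page is a dense wall of text with no spacing at all. How should I read a Word document so that its formatting is preserved?
Below is the function I wrote to read the document through the COM component. Could an expert tell me how to change it so the formatting is kept? If there is a better way to read the document, please teach me. Thanks!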
public string Doc2Text(string docFileName)
{
Microsoft.Office.Interop.Word.ApplicationClass wordApp = new Microsoft.Office.Interop.Word.ApplicationClass();
object fileobj = docFileName;
object nullobj = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(ref fileobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj
);
// Content.Paragraphs is a collection, not a string; Content.Text returns the plain text.
string outText = doc.Content.Text;
doc.Close(ref nullobj, ref nullobj, ref nullobj);
wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);
return outText;
}
To the first and second repliers: could you be more specific?
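On the garbled output described above: a legacy .doc file is a binary OLE compound file and a .docx file is a ZIP package, so neither is encoded text, and no StreamReader encoding will decode it. The sketch below only inspects the leading bytes of a file to show this; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class WordFileSniffer
{
    // Leading bytes of an OLE compound file (legacy .doc).
    private static readonly byte[] OleMagic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    // Leading bytes of a ZIP archive (.docx).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies a file from its first few bytes.
    public static string DescribeFormat(byte[] header)
    {
        if (StartsWith(header, OleMagic)) return "binary .doc (OLE compound file)";
        if (StartsWith(header, ZipMagic)) return ".docx (ZIP package)";
        return "unknown, possibly plain text";
    }

    // Reads up to eight bytes from the file and classifies them.
    public static string DescribeFile(string path)
    {
        byte[] header = new byte[8];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(header, 0, header.Length);
            Array.Resize(ref header, read);
        }
        return DescribeFormat(header);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

If DescribeFile reports either Word format, the file has to go through something that understands that format (Word itself or a library) rather than a text reader.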
4 answers
Recommended on 2016-12-01
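Add a reference to the Word COM component, Microsoft Word 11.0 Object Library (my Office 2003 version is 8.3). A default Office installation does not include this component. Run setup from the Office disc, choose Add or Remove Features, select Custom, and under Microsoft Office Word you will find .NET Programmability Support. Install it.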
After the reference is added to the project, Web.config gains these lines:
<compilation debug="false">
<assemblies>
<add assembly="Microsoft.Office.Interop.Word, Version=11.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"/>
</assemblies>
</compilation>
Code:
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.ApplicationClass();
//Word.ApplicationClass word = new Word.ApplicationClass();
Type wordType = word.GetType();
Microsoft.Office.Interop.Word.Documents docs = word.Documents;
// Open the file
Type docsType = docs.GetType();
object fileName = "e:\\cc.doc";
Microsoft.Office.Interop.Word.Document doc = (Microsoft.Office.Interop.Word.Document)docsType.InvokeMember("Open",
System.Reflection.BindingFlags.InvokeMethod, null, (object)docs, new Object[] { fileName, true, true });
// Convert the format by saving under a new name
Type docType = doc.GetType();
object saveFileName = "e:\\aaa.html";
// The call further down is written for the Microsoft Word 9 (11.0) Object Library; with version 10 (untested) it might be written as:
/*
docType.InvokeMember("SaveAs", System.Reflection.BindingFlags.InvokeMethod,
null, doc, new object[]{saveFileName, Word.WdSaveFormat.wdFormatFilteredHTML});
*/
/// Other formats:
///wdFormatHTML
///wdFormatDocument
///wdFormatDOSText
///wdFormatDOSTextLineBreaks
///wdFormatEncodedText
///wdFormatRTF
///wdFormatTemplate
///wdFormatText
///wdFormatTextLineBreaks
///wdFormatUnicodeText
docType.InvokeMember("SaveAs", System.Reflection.BindingFlags.InvokeMethod,
null, doc, new object[] { saveFileName, Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatHTML });
// Quit Word
wordType.InvokeMember("Quit", System.Reflection.BindingFlags.InvokeMethod,
null, word, null);
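The reflection calls above are late binding written out by hand. On C# 4 or later the same steps can probably be expressed with dynamic, which also makes it easier to close the document and quit Word in a finally block so that no WINWORD.EXE process is left behind. This is only a sketch under those assumptions: the class name, the HtmlPathFor helper and the choice of filtered HTML are mine, not part of the answer, and it needs Windows with Word installed.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class WordToHtml
{
    // Numeric value of WdSaveFormat.wdFormatFilteredHTML in the Word object model.
    private const int WdFormatFilteredHtml = 10;

    // Hypothetical helper: same folder and name as the source, with an .html extension.
    public static string HtmlPathFor(string docPath)
    {
        return Path.ChangeExtension(docPath, ".html");
    }

    // Late-bound conversion; no reference to the interop assembly is required.
    public static string Convert(string docPath)
    {
        string htmlPath = HtmlPathFor(docPath);
        Type wordType = Type.GetTypeFromProgID("Word.Application", true);
        dynamic word = Activator.CreateInstance(wordType);
        dynamic doc = null;
        try
        {
            word.Visible = false;
            // Arguments: FileName, ConfirmConversions, ReadOnly.
            doc = word.Documents.Open(docPath, false, true);
            doc.SaveAs(htmlPath, WdFormatFilteredHtml);
            return htmlPath;
        }
        finally
        {
            if (doc != null)
            {
                doc.Close(false); // do not save changes back to the source file
                Marshal.ReleaseComObject((object)doc);
            }
            word.Quit();
            Marshal.ReleaseComObject((object)word);
        }
    }
}
```

Calling WordToHtml.Convert(@"e:\cc.doc") would then write e:\cc.html, and the page can serve or embed that file.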
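COM can give you the formatting, including paragraphs, fonts, styles and so on. The question is how you intend to use it. For example, if you want to output the Word document as HTML, you have to read those properties yourself and turn them into HTML markup.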
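As a very small illustration of that idea, the sketch below keeps only paragraph and line breaks. It assumes the input is the string returned by Range.Text, where Word separates paragraphs with '\r' and marks manual line breaks with '\v'. Fonts and styles are not handled; carrying them over would mean walking the Paragraphs collection and reading each range's font properties. The class name is hypothetical.

```csharp
using System;
using System.Net;
using System.Text;

public static class PlainTextToHtml
{
    // Wraps each Word paragraph in <p>...</p> and HTML-encodes its text.
    public static string Convert(string wordText)
    {
        if (string.IsNullOrEmpty(wordText)) return string.Empty;

        var html = new StringBuilder();
        foreach (string paragraph in wordText.Split('\r'))
        {
            if (paragraph.Length == 0) continue; // skip empty paragraphs

            // Encode each line separately so the <br /> tags are not escaped.
            string[] lines = paragraph.Split('\v');
            for (int i = 0; i < lines.Length; i++)
            {
                lines[i] = WebUtility.HtmlEncode(lines[i]);
            }
            html.Append("<p>").Append(string.Join("<br />", lines)).Append("</p>\n");
        }
        return html.ToString();
    }
}
```

With the asker's function, writing PlainTextToHtml.Convert(outText) to the page instead of outText should at least restore the spacing between paragraphs.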
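Another approach is to save the DOC in XML format, so that it can be processed as XML. So far, though, I have not seen an XSLT that handles the Word XML format.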
You can use the Aspose.Words component to convert the document to an HTML file:
Document doc = new Document("Document.docx");
doc.Save("Document.ConvertToHtml Out.html",Aspose.Words.SaveFormat.Html);
Method 1:
The code is as follows:
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "Application/msword";
string s=Server.MapPath("C#语言参考.doc");
// Send the file at the mapped physical path. Writing the path string itself into the response would corrupt the document.
Response.WriteFile(s);
Response.Flush();
Response.Close();
Method 2:
The code is as follows:
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "Application/msword";
string strFilePath="";
strFilePath =Server.MapPath("C#语言参考.doc");
// FileMode.Open fails if the file is missing; OpenOrCreate would silently create an empty file.
FileStream fs = new FileStream(strFilePath,FileMode.Open,FileAccess.Read);
Response.WriteFile(strFilePath,0,fs.Length);
fs.Close();
Method 3:
The code is as follows:
string path=Server.MapPath("C#语言参考.doc");
FileInfo file=new FileInfo(path);
FileStream myfileStream=new FileStream(path,FileMode.Open,FileAccess.Read);
byte[] filedata=new Byte[file.Length];
myfileStream.Read(filedata,0,(int)(file.Length));
myfileStream.Close();
Response.Clear();
Response.ContentType="application/msword";
Response.AddHeader("Content-Disposition","attachment;filename=文件名.doc");
Response.Flush();
Response.BinaryWrite(filedata);
Response.End();
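One detail in Method 3 that may cause trouble is the non-ASCII file name in the Content-Disposition header, which some browsers show garbled. A common workaround is to send an ASCII fallback together with a UTF-8 percent-encoded filename* parameter in the RFC 6266 style. The helper below is a hypothetical sketch of that, not something from the answer.

```csharp
using System;
using System.Text;

public static class DownloadHeader
{
    // Builds a Content-Disposition value with an ASCII fallback and a UTF-8 filename*.
    public static string ContentDisposition(string fileName)
    {
        // Fallback for older clients: replace anything outside printable ASCII,
        // plus quotes and backslashes, with an underscore.
        var fallback = new StringBuilder();
        foreach (char c in fileName)
        {
            bool unsafeChar = c < 0x20 || c > 0x7E || c == '"' || c == '\\';
            fallback.Append(unsafeChar ? '_' : c);
        }

        // Extended parameter: the name as percent-encoded UTF-8.
        string encoded = Uri.EscapeDataString(fileName);

        return "attachment; filename=\"" + fallback + "\"; filename*=UTF-8''" + encoded;
    }
}
```

In Method 3 the header line would then become Response.AddHeader("Content-Disposition", DownloadHeader.ContentDisposition("文件名.doc"));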