Q: Fastest way to record all DocIds and FileNames from dtSearch in a SQL database

Stack Overflow user

Asked 2015-09-02 03:14:24

Answers: 4, Views: 679, Followers: 0, Votes: 3

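I am using dtSearch together with a SQL database and want to maintain a table containing every DocId and its associated FileName. From there, I will add a column holding my foreign key so that I can combine text and database searches.

I have code that simply returns every record in the index and adds them to the DB one at a time. However, this takes a very long time, and it does not solve the problem of how to simply append new records as they are added to the index. But just in case: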

MyDatabaseContext db = new StateScapeEntities();
IndexJob ij = new dtSearch.Engine.IndexJob();

ij.IndexPath = @"d:\myindex";

IndexInfo indexInfo = dtSearch.Engine.IndexJob.GetIndexInfo(@"d:\myindex");

bool jobDone =   ij.Execute();

SearchResults sr = new SearchResults();

uint n = indexInfo.DocCount;

for (int i = 1; i <= n; i++)
{
    sr.AddDoc(ij.IndexPath, i, null);
}

for (int i = 1; i <= n; i++)
{
    sr.GetNthDoc(i - 1);
    //IndexDocument is defined elsewhere
    IndexDocument id = new IndexDocument();
    id.DocId = sr.CurrentItem.DocId;
    id.FilePath = sr.CurrentItem.Filename;

    if (id.FilePath != null)
    {
        db.IndexDocuments.Add(id);
        db.SaveChanges();
    }
}

sql-server

dtsearch

Answers: 4

Stack Overflow user

Accepted answer

Posted 2016-03-22 16:37:12

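So, I used part of user2172986's response, but combined it with some additional code to solve my problem. I did have to set the dtsIndexKeepExistingDocIds flag in my index update routine. From there, I only wanted to add the newly created DocIds to my SQL database. For that, I used the following code: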

string indexPath = @"d:\myindex"; 

        using (IndexJob ij = new dtSearch.Engine.IndexJob())
        {
            //make sure the updated index doesn't change DocIds
            ij.IndexingFlags = IndexingFlags.dtsIndexKeepExistingDocIds;
            ij.IndexPath = indexPath;
            ij.ActionAdd = true;
            ij.FoldersToIndex.Add( indexPath + "<+>");
            ij.IncludeFilters.Add( "*");
            bool jobDone = ij.Execute();
        }
        //create a DataTable to hold results
        DataTable newIndexDoc = MakeTempIndexDocTable(); //this is a custom method not included in this example; just creates a DataTable with the appropriate columns

        //connect to the DB;
        MyDataBase db = new MyDataBase(); //again, custom code not included - link to EntityFramework entity

        //get the last DocId in the DB (0 if the table is still empty)
        int lastDbDocId = db.IndexDocuments.Max(d => (int?)d.DocId) ?? 0;

        //get the last DocId in the Index
        IndexInfo indexInfo = dtSearch.Engine.IndexJob.GetIndexInfo(indexPath);

        uint latestIndexDocId = indexInfo.LastDocId;

        //create a searchFilter
        dtSearch.Engine.SearchFilter sf = new SearchFilter();

        int indexId = sf.AddIndex(indexPath);


        //only select new records (from one greater than the last DocId in the DB to the last DocId in the index itself)
        sf.SelectItems(indexId, lastDbDocId + 1, (int)latestIndexDocId, true);

        using (SearchJob sj = new dtSearch.Engine.SearchJob())
        {
           sj.SetFilter(sf);
           //return every document in the specified range (using xfirstword)
           sj.Request = "xfirstword";
           // Specify the path to the index to search here
           sj.IndexesToSearch.Add(indexPath);


          //additional flags and limits redacted for clarity

           sj.Execute();

           // Store the error message in the status
           //redacted for clarity



           SearchResults results = sj.Results;
           int startIdx = 0;
           int endIdx = results.Count;
           if (startIdx==endIdx)
               return;


           for (int i = startIdx; i < endIdx; i++)
           {
               results.GetNthDoc(i);

               IndexDocument id = new IndexDocument();
               id.DocId = results.CurrentItem.DocId;
               id.FileName= results.CurrentItem.Filename;

               if (id.FileName!= null)
               {

                   DataRow row = newIndexDoc.NewRow();

                   row["DocId"] = id.DocId;
                   row["FileName"] = id.FileName;

                   newIndexDoc.Rows.Add(row);
               }


           }

           newIndexDoc.AcceptChanges();

           //SqlBulkCopy
           using (SqlConnection connection =
                  new SqlConnection(db.Database.Connection.ConnectionString))
           {
               connection.Open();

               using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
               {
                   bulkCopy.DestinationTableName =
                       "dbo.IndexDocument";

                   try
                   {
                       // Write from the source to the destination.
                       bulkCopy.WriteToServer(newIndexDoc);
                   }
                   catch (Exception ex)
                   {
                       Console.WriteLine(ex.Message);
                   }
               }
           }

           newIndexDoc.Clear();
           db.UpdateIndexDocument();
        }
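
The MakeTempIndexDocTable helper is not shown above. As a minimal sketch of what such a helper might look like, assuming the destination table has an integer DocId column and a string FileName column (the class name, table name, column types, and sample path are assumptions for illustration, not taken from the original code):

```csharp
using System;
using System.Data;

public static class IndexDocTableSketch
{
    // Hypothetical stand-in for the MakeTempIndexDocTable helper that the
    // answer mentions but does not show. The column names match the
    // row["..."] accesses in the answer; the column types are assumptions.
    public static DataTable MakeTempIndexDocTable()
    {
        var table = new DataTable("IndexDocument");
        table.Columns.Add("DocId", typeof(int));
        table.Columns.Add("FileName", typeof(string));
        return table;
    }

    public static void Main()
    {
        DataTable table = MakeTempIndexDocTable();

        // Add one row the same way the answer's loop does.
        DataRow row = table.NewRow();
        row["DocId"] = 1;
        row["FileName"] = @"d:\docs\example.txt"; // made-up sample path
        table.Rows.Add(row);

        Console.WriteLine(table.Rows.Count);
    }
}
```

By default SqlBulkCopy matches source and destination columns by ordinal position. If the DataTable columns might not line up with the columns of dbo.IndexDocument, explicit mappings such as bulkCopy.ColumnMappings.Add("DocId", "DocId") and bulkCopy.ColumnMappings.Add("FileName", "FileName") avoid relying on column order.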

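Votes: 1

Stack Overflow user

Posted 2015-12-17 14:13:44

To preserve the DocIds in an index, you must use the dtsIndexKeepExistingDocIds flag in the IndexJob.

You can also look at what the dtSearch Text Retrieval Engine Programmer's Reference says about when DocIds change: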

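When a document is added to an index, it is assigned a DocId, and DocIds are always numbered sequentially.
When a document is reindexed, the old DocId is cancelled and a new DocId is assigned.
When an index is compressed, all DocIds in the index are renumbered to remove the cancelled DocIds, unless the dtsIndexKeepExistingDocIds flag is set in the IndexJob.
When an index is merged into another index, DocIds in the target index are never changed. The documents merged into the target index are all assigned new, sequentially numbered DocIds, unless (a) the dtsIndexKeepExistingDocIds flag is set in the IndexJob and (b) the indexes have non-overlapping ranges of DocIds.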
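
Taken together, these rules suggest that when dtsIndexKeepExistingDocIds is set, any document added since the last sync has a DocId greater than the highest one already stored in the database. A small sketch of that range calculation follows. The class name, method name, and the convention of returning null when nothing is new are illustrative assumptions, and the sketch does not deal with reindexed documents, whose cancelled DocIds would leave stale rows behind.

```csharp
using System;

public static class DocIdRangeSketch
{
    // Returns the inclusive range of DocIds that exist in the index but are
    // not yet in the database, or null when the database is up to date.
    // Assumes DocIds are sequential and stable, i.e. the index is always
    // updated with dtsIndexKeepExistingDocIds set.
    public static (int First, int Last)? NewDocIdRange(int lastDbDocId, uint lastIndexDocId)
    {
        // Throws OverflowException rather than silently wrapping if the
        // index DocId does not fit in an int.
        int last = checked((int)lastIndexDocId);
        if (last <= lastDbDocId)
            return null;
        return (lastDbDocId + 1, last);
    }

    public static void Main()
    {
        var range = NewDocIdRange(100, 125u);
        Console.WriteLine(range.HasValue
            ? $"{range.Value.First}..{range.Value.Last}"
            : "nothing new");
    }
}
```

The two values in the returned range correspond to the first and last DocId that a search filter or query would need to cover in order to pick up only the newly indexed documents.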

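Votes: 2

Stack Overflow user

Posted 2015-12-17 14:05:30

To improve speed, you can search for the word "xfirstword" to get every document in the index.

You can also look at the FAQ entry on how to retrieve all documents in an index.

Votes: 1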

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/32344084
